Thursday, September 30, 2010

JPA Demystified (episode 2) - Persisting entities

Let's go back to the example described in previous post - JPA Demystified (episode 1). We will take a deeper look into cascading of basic operations on entities.

We will stick with MySQL (version 5.1) as the database used for our tests. To be able to create the new records in the database, we have to adjust the table definitions from JPA Demystified (episode 1) a little, to allow usage of MySQL's AUTO_INCREMENT feature (see the end of this post).

Let's start with persisting the entity, for example Employer, with one Employee having one Benefit. This way we should trigger inserts into 3 database tables (each entity is stored in separate table).

We will create the entities for our tests in following manner:
    Employer employer = new Employer();  
    // ... set the employer properties ...  
    Employee employee = new Employee();  
    employee.setEmployer(employer);  
    // ... set the employee properties ...   
    Benefit benefit = new Benefit();  
    benefit.setEmployee(employee);  
    // ... set the benefit properties ...
    employee.setBenefits(Arrays.asList(benefit));
    employer.setEmployees(Arrays.asList(employee));
and try to persist it using following DAO:
public class DefaultEmployerDAO implements EmployerDAO {

    @PersistenceContext
    private EntityManager entityManager;

    public void persist(Employer employer) {
        entityManager.persist(employer);
    }
    ...
}
When you look at the tables definitions, you realize that we will need the primary key value to be set for each entity stored into database. Entity identifiers can be set by hand, if you want the full control over it, or generated automatically, which can be achieved using @GeneratedValue annotation. For example:
    @Id
    @GeneratedValue
    public Long getId() {
        return id;
    }
You have to be aware that default strategy for the primary key generation depends on the JPA provider and database used. For MySQL it can be AUTO_INCREMENT usage (Hibernate) or sequence table (EclipseLink, OpenJPA). As you see using default value isn't as consistent across the JPA providers as we would like it to be. Let's make it more predictable ;)
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    public Long getId() {
        return id;
    }
Having the above on each entity identifier will force all three providers to use the MySQL's AUTO_INCREMENT feature for generating primary keys.

You shouldn't smile yet ;) This is not the end of our story ... When you try to run persist method for the Employer now, you will encounter another inconsistency between the JPA providers. Only Hibernate is writing anything into database (into EMPLOYER table only), while both EclipseLink and OpenJPA complaints about the problems with not persisted objects bound to Employer. Why? - because we didn't used the cascade attribute for any of @OneToMany annotations yet, and thus all entities related to the Employer will not be persisted automatically.
We have to change the correlation between entities for Employer
    @OneToMany(mappedBy = "employer", cascade = CascadeType.PERSIST)
    public List getEmployees() {
        return employees;
    }
and for the Employee
    @OneToMany(mappedBy = "employee", cascade = CascadeType.PERSIST)    
    public List getBenefits() {
        return benefits;
    }
and now we can smile watching the records being written into database
-- Hibernate 3.4.0.GA
insert into EMPLOYERS (BUSINESS_NAME) values ('Business Name')
insert into EMPLOYEES (EMPLOYER_ID, FIRST_NAME) values (7, 'Jack')
insert into BENEFITS (EMPLOYEE_ID, END_DATE, name, START_DATE, BENEFIT_TYPE) values (2, '', 'Car for everyone!', '2010-09-26', 'TRANSPORTATION')

-- EclipseLink 2.1.1
INSERT INTO EMPLOYERS (BUSINESS_NAME) VALUES ('Business Name')
SELECT LAST_INSERT_ID()
INSERT INTO EMPLOYEES (FIRST_NAME, EMPLOYER_ID) VALUES ('Jack', 8)
SELECT LAST_INSERT_ID()
INSERT INTO BENEFITS (START_DATE, NAME, END_DATE, BENEFIT_TYPE, EMPLOYEE_ID) VALUES ('2010-09-26', 'Car for everyone!', '', 'TRANSPORTATION', 3)
SELECT LAST_INSERT_ID()

-- OpenJPA 2.0.1
INSERT INTO EMPLOYERS (BUSINESS_NAME) VALUES ('Business Name')
INSERT INTO EMPLOYEES (FIRST_NAME, EMPLOYER_ID) VALUES ('Jack', 9)
INSERT INTO BENEFITS (END_DATE, name, START_DATE, BENEFIT_TYPE, EMPLOYEE_ID) VALUES ('', 'Car for everyone!', '2010-09-26', 'TRANSPORTATION', 4)
As you see in this post, persist will store your entity into database, generating primary keys if requested, and updating the entities with the generated keys. Persisting entities is cascaded according to the cascade attribute of annotations describing correlations between entities. This attribute holds an array of cascade types, if one of the array elements is CascadeType.ALL or CascadeType.PERSIST the cascading will occur.

And as usual - the most boring part for the dessert :)

CREATE TABLE BENEFITS (  
   ID           INT UNSIGNED NOT NULL AUTO_INCREMENT,  
   EMPLOYEE_ID  INT UNSIGNED NOT NULL,  
   BENEFIT_TYPE VARCHAR(64) NOT NULL,  
   START_DATE   DATE,  
   END_DATE     DATE,  
   NAME         VARCHAR(128),   
   CONSTRAINT BENEFITS_PK PRIMARY KEY (ID)  
 );  
   
 CREATE TABLE EMPLOYEES (  
   ID          INT UNSIGNED NOT NULL AUTO_INCREMENT,  
   EMPLOYER_ID INT UNSIGNED NOT NULL,   
   FIRST_NAME  VARCHAR(64),   
   CONSTRAINT EMPLOYEES_PK PRIMARY KEY (ID)  
 );  
   
 CREATE TABLE EMPLOYERS (  
   ID            INT UNSIGNED NOT NULL AUTO_INCREMENT,   
   BUSINESS_NAME VARCHAR(128),   
   CONSTRAINT EMPLOYERS_PK PRIMARY KEY (ID)  
 );     

Nice traffic map for Poland

Interesting traffic map for Poland http://mapa.targeo.pl. I'm curious how precise it is, and how often updated ...

Saturday, September 25, 2010

JPA Demystified (episode 1) - @OneToMany and @ManyToOne mappings

Let's play with some @OneToMany and @ManyToOne mappings :) As you may read in wise books or boring javadocs, they serve for mapping 1-* and *-1 relations respectively.

Let's dig a little more into it using following example: Employer (1-*) Employee (1-*) Benefit (See below for DB tables and their content). Entity model for this situation will look like this:
@Entity
@Table(name = "EMPLOYERS")
public class Employer implements Serializable {

    private String businessName;
    private List<Employee> employees;
    private Long id;

    @Column(name = "BUSINESS_NAME")
    public String getBusinessName() {
        return businessName;
    }

    @OneToMany(mappedBy = "employer")
    public List<Employee> getEmployees() {
        return employees;
    }

    @Id
    public Long getId() {
        return id;
    }

...
}

@Entity
@Table(name = "EMPLOYEES")
public class Employee implements Serializable {

    private List<Benefit> benefits;
    private Employer employer;
    private String firstName;
    private Long id;

    @OneToMany(mappedBy = "employee")
    public List<Benefit> getBenefits() {
        return benefits;
    }

    @ManyToOne
    @JoinColumn(name = "EMPLOYER_ID")
    public Employer getEmployer() {
        return employer;
    }

    @Column(name = "FIRST_NAME")
    public String getFirstName() {
        return firstName;
    }

    @Id
    public Long getId() {
        return id;
    }
...
}

@Entity
@Table(name = "BENEFITS")
public class Benefit implements Serializable {

    private Employee employee;
    private Date endDate;
    private Long id;
    private String name;
    private Date startDate;
    private BenefitType type;

    @ManyToOne
    @JoinColumn(name = "EMPLOYEE_ID")
    public Employee getEmployee() {
        return employee;
    }

    @Column(name = "END_DATE")
    @Temporal(TemporalType.DATE)
    public Date getEndDate() {
        return endDate;
    }

    @Id
    public Long getId() {
        return id;
    }

    @Basic
    public String getName() {
        return name;
    }

    @Column(name = "START_DATE")
    @Temporal(TemporalType.DATE)
    public Date getStartDate() {
        return startDate;
    }

    @Column(name = "BENEFIT_TYPE")
    @Enumerated(EnumType.STRING)
    public BenefitType getType() {
        return type;
    }
...
}
Time for real fun :) - let's query the DB using following DAO:
public class DefaultEmployeeDAO implements EmployeeDAO {

    @PersistenceContext
    private EntityManager entityManager;

    public Employee get(Long id) {
        return entityManager.find(Employee.class, id);
    }
...
}
Executed SQL queries depend on used JPA provider:
-- Hibernate 3.4.0.GA
select employee0_.id as id1_1_, employee0_.EMPLOYER_ID as EMPLOYER3_1_1_, employee0_.FIRST_NAME as FIRST2_1_1_, employer1_.id as id2_0_, employer1_.BUSINESS_NAME as BUSINESS2_2_0_ 
from EMPLOYEES employee0_ left outer join EMPLOYERS employer1_ on employee0_.EMPLOYER_ID=employer1_.id 
where employee0_.id = 1 

-- EclipseLink 2.1.1
SELECT ID, FIRST_NAME, EMPLOYER_ID FROM EMPLOYEES WHERE (ID = 1)
SELECT ID, BUSINESS_NAME FROM EMPLOYERS WHERE (ID = 1) 

-- OpenJPA 2.0.1
SELECT t1.id, t1.BUSINESS_NAME, t0.FIRST_NAME 
FROM EMPLOYEES t0 LEFT OUTER JOIN EMPLOYERS t1 ON t0.EMPLOYER_ID = t1.id 
WHERE t0.id = 1 

As you see @ManyToOne mapping causes fetching the referenced entity along with the referencing one (fetch type EAGER), while fetching entity mapped with @OneToMany is deferred until it will be accessed for the first time (fetch type LAZY).

Take a deeper look at the queries generated by EclipseLink. What the heck!? Why they don't use joins?! And what we can do about it?

This is the situation where @JoinFetch comes to the rescue - but be aware - this annotation is EclipseLink addition to JPA and hence makes your mappings not portable across the different providers! How can it be used? see example below:
    @ManyToOne
    @JoinColumn(name = "EMPLOYER_ID")
    @JoinFetch(JoinFetchType.OUTER)
    public Employer getEmployer() {
        return employer;
    }
After adding the @JoinFetch your EclipseLink SQL queries will turn into:
SELECT t1.ID, t1.FIRST_NAME, t1.EMPLOYER_ID, t0.ID, t0.BUSINESS_NAME FROM EMPLOYEES t1 LEFT OUTER JOIN EMPLOYERS t0 ON (t0.ID = t1.EMPLOYER_ID) WHERE (t1.ID = 1)

Let's back to the @OneToMany annotation for a while, and check how the providers will behave when we change the default fetch type to EAGER, in following way:
    @OneToMany(mappedBy = "employee", fetch = FetchType.EAGER)
    public List getBenefits() {
        return benefits;
    }
This time the SQL queries are:
-- Hibernate 3.4.0.GA
select employee0_.id as id1_2_, employee0_.EMPLOYER_ID as EMPLOYER3_1_2_, employee0_.FIRST_NAME as FIRST2_1_2_, benefits1_.EMPLOYEE_ID as EMPLOYEE6_4_, benefits1_.id as id4_, benefits1_.id as id0_0_, benefits1_.EMPLOYEE_ID as EMPLOYEE6_0_0_, benefits1_.END_DATE as END2_0_0_, benefits1_.name as name0_0_, benefits1_.START_DATE as START4_0_0_, benefits1_.BENEFIT_TYPE as BENEFIT5_0_0_, employer2_.id as id2_1_, employer2_.BUSINESS_NAME as BUSINESS2_2_1_ 
from EMPLOYEES employee0_ left outer join BENEFITS benefits1_ on employee0_.id=benefits1_.EMPLOYEE_ID left outer join EMPLOYERS employer2_ on employee0_.EMPLOYER_ID=employer2_.id 
where employee0_.id=1

-- EclipseLink 2.1.1 (without @JoinFetch)
SELECT ID, FIRST_NAME, EMPLOYER_ID FROM EMPLOYEES WHERE (ID = 1)
SELECT ID, BUSINESS_NAME FROM EMPLOYERS WHERE (ID = 1)
SELECT ID, START_DATE, NAME, END_DATE, BENEFIT_TYPE, EMPLOYEE_ID FROM BENEFITS WHERE (EMPLOYEE_ID = 1)

-- OpenJPA 2.0.1
SELECT t1.id, t1.BUSINESS_NAME, t0.FIRST_NAME, t2.EMPLOYEE_ID, t2.id, t2.END_DATE, t2.name, t2.START_DATE, t2.BENEFIT_TYPE 
FROM EMPLOYEES t0 LEFT OUTER JOIN EMPLOYERS t1 ON t0.EMPLOYER_ID = t1.id LEFT OUTER JOIN BENEFITS t2 ON t0.id = t2.EMPLOYEE_ID 
WHERE t0.id = 1 ORDER BY t2.EMPLOYEE_ID ASC
Both Hibernate and OpenJPA generate impressive queries :) - but in fact they are using the most efficient way to fetch all correlated entities.
Quick note for OpenJPA users - the above example leads to an exception while processing the results:
<openjpa-2.0.1-r422266:989424 nonfatal general error> org.apache.openjpa.persistence.PersistenceException: null
FailedObject: ... .entities.domain.Employee-1 [org.apache.openjpa.util.LongId] [java.lang.String]
 at org.apache.openjpa.kernel.BrokerImpl.find(BrokerImpl.java:986)
 at org.apache.openjpa.kernel.BrokerImpl.find(BrokerImpl.java:880)
 at org.apache.openjpa.kernel.DelegatingBroker.find(DelegatingBroker.java:223)
 at org.apache.openjpa.persistence.EntityManagerImpl.find(EntityManagerImpl.java:477)
    ...
Caused by: java.lang.NullPointerException
 at org.apache.openjpa.jdbc.kernel.JDBCStoreManager.setInverseRelation(JDBCStoreManager.java:469)
 at org.apache.openjpa.jdbc.kernel.JDBCStoreManager.initializeState(JDBCStoreManager.java:429)
 at org.apache.openjpa.jdbc.kernel.JDBCStoreManager.initialize(JDBCStoreManager.java:322)
It looks like an error in OpenJPA.

Back to mainstream :) - we can narrow the number of queries generated by EclipseLink using @JoinFetch again:
    @OneToMany(mappedBy = "employee", fetch = FetchType.EAGER)
    @JoinFetch(JoinFetchType.OUTER)
    public List getBenefits() {
        return benefits;
    }

    @ManyToOne
    @JoinColumn(name = "EMPLOYER_ID")
    @JoinFetch(JoinFetchType.OUTER)
    public Employer getEmployer() {
        return employer;
    }
The above will lead to one query (instead of 3):
SELECT DISTINCT t1.ID, t1.FIRST_NAME, t1.EMPLOYER_ID, t0.ID, t0.START_DATE, t0.NAME, t0.END_DATE, t0.BENEFIT_TYPE, t0.EMPLOYEE_ID, t2.ID, t2.BUSINESS_NAME 
FROM EMPLOYEES t1 LEFT OUTER JOIN BENEFITS t0 ON (t0.EMPLOYEE_ID = t1.ID) LEFT OUTER JOIN EMPLOYERS t2 ON (t2.ID = t1.EMPLOYER_ID) 
WHERE (t1.ID = 1)

Finally, let's focus on optional attribute of @ManyToOne annotation. In our example both Employee and Benefit cannot exist without the Employer and Employee respectively, therefore we can change the mapping to:
    @ManyToOne(optional = false)
    @JoinColumn(name = "EMPLOYER_ID")
    public Employer getEmployer() {
        return employer;
    }
for the Employee, and:
    @ManyToOne(optional = false)
    @JoinColumn(name = "EMPLOYEE_ID")
    public Employee getEmployee() {
        return employee;
    }
for the Benefit.
This change will generate queries:
-- Hibernate 3.4.0.GA
select employee0_.id as id1_2_, employee0_.EMPLOYER_ID as EMPLOYER3_1_2_, employee0_.FIRST_NAME as FIRST2_1_2_, benefits1_.EMPLOYEE_ID as EMPLOYEE6_4_, benefits1_.id as id4_, benefits1_.id as id0_0_, benefits1_.EMPLOYEE_ID as EMPLOYEE6_0_0_, benefits1_.END_DATE as END2_0_0_, benefits1_.name as name0_0_, benefits1_.START_DATE as START4_0_0_, benefits1_.BENEFIT_TYPE as BENEFIT5_0_0_, employer2_.id as id2_1_, employer2_.BUSINESS_NAME as BUSINESS2_2_1_ 
from EMPLOYEES employee0_ left outer join BENEFITS benefits1_ on employee0_.id=benefits1_.EMPLOYEE_ID inner join EMPLOYERS employer2_ on employee0_.EMPLOYER_ID=employer2_.id where employee0_.id=1

-- EclipseLink 2.1.1 (without @JoinFetch)
SELECT ID, FIRST_NAME, EMPLOYER_ID FROM EMPLOYEES WHERE (ID = 1)
SELECT ID, BUSINESS_NAME FROM EMPLOYERS WHERE (ID = 1)
SELECT ID, START_DATE, NAME, END_DATE, BENEFIT_TYPE, EMPLOYEE_ID FROM BENEFITS WHERE (EMPLOYEE_ID = 1)

-- OpenJPA 2.0.1
SELECT t1.id, t1.BUSINESS_NAME, t0.FIRST_NAME, t2.EMPLOYEE_ID, t2.id, t2.END_DATE, t2.name, t2.START_DATE, t2.BENEFIT_TYPE FROM EMPLOYEES t0 INNER JOIN EMPLOYERS t1 ON t0.EMPLOYER_ID = t1.id LEFT OUTER JOIN BENEFITS t2 ON t0.id = t2.EMPLOYEE_ID WHERE t0.id = 1 ORDER BY t2.EMPLOYEE_ID ASC
Don't you think that default behavior of EclipseLink is a little strange? What would we do without the @JoinFetch? When we add it in the following way:
    @OneToMany(mappedBy = "employee", fetch = FetchType.EAGER)
    @JoinFetch(JoinFetchType.OUTER)
    public List getBenefits() {
        return benefits;
    }

    @ManyToOne(optional = false)
    @JoinColumn(name = "EMPLOYER_ID")
    @JoinFetch(JoinFetchType.INNER)
    public Employer getEmployer() {
        return employer;
    }
The world becomes beautiful again:
SELECT DISTINCT t1.ID, t1.FIRST_NAME, t1.EMPLOYER_ID, t0.ID, t0.BUSINESS_NAME, t2.ID, t2.START_DATE, t2.NAME, t2.END_DATE, t2.BENEFIT_TYPE, t2.EMPLOYEE_ID 
FROM EMPLOYEES t1 LEFT OUTER JOIN BENEFITS t2 ON (t2.EMPLOYEE_ID = t1.ID), EMPLOYERS t0 
WHERE ((t1.ID = 1) AND (t0.ID = t1.EMPLOYER_ID))
I know, you are already tired reading the above thoughts ;) - so let's finish for now - we will return to the @OneToMany and @ManyToOne soon with some additional deliberation on cascading and a little more complicated DB examples.

The boring part now ;) - read it at your own risk ;) - DB tables and their content (MySQL targeted)
CREATE TABLE BENEFITS (
    ID            NUMERIC     NOT NULL,
    EMPLOYEE_ID   NUMERIC     NOT NULL,
    BENEFIT_TYPE  VARCHAR(64) NOT NULL,
    START_DATE    DATE,
    END_DATE      DATE,
    NAME          VARCHAR(128), 
    CONSTRAINT BENEFITS_PK PRIMARY KEY (ID)
);

CREATE TABLE EMPLOYEES (
    ID           NUMERIC NOT NULL,
    EMPLOYER_ID  NUMERIC NOT NULL, 
    FIRST_NAME   VARCHAR(64), 
    CONSTRAINT EMPLOYEES_PK PRIMARY KEY (ID)
);

CREATE TABLE EMPLOYERS (
    ID             NUMERIC NOT NULL, 
    BUSINESS_NAME  VARCHAR(128), 
    CONSTRAINT EMPLOYERS_PK PRIMARY KEY (ID)
);

INSERT INTO EMPLOYERS (ID, BUSINESS_NAME) VALUES (1, 'Mighty Ducks');

INSERT INTO EMPLOYEES (ID, EMPLOYER_ID, FIRST_NAME) VALUES (1, 1, 'John');
INSERT INTO EMPLOYEES (ID, EMPLOYER_ID, FIRST_NAME) VALUES (2, 1, 'Mary');
INSERT INTO EMPLOYEES (ID, EMPLOYER_ID, FIRST_NAME) VALUES (3, 1, 'Eugene');

INSERT INTO BENEFITS (ID, EMPLOYEE_ID, BENEFIT_TYPE, START_DATE, END_DATE, NAME)
    VALUES(1, 1, 'HEALTH_COVERAGE', '2005-01-01', null, 'Healthy Employees');
INSERT INTO BENEFITS (ID, EMPLOYEE_ID, BENEFIT_TYPE, START_DATE, END_DATE, NAME)
    VALUES(2, 1, 'RETIREMENT_PLAN', '2005-01-01', null, 'Gold Autumn');    
INSERT INTO BENEFITS (ID, EMPLOYEE_ID, BENEFIT_TYPE, START_DATE, END_DATE, NAME)
    VALUES(3, 1, 'GROUP_TERM_LIFE', '2005-01-01', null, 'Always Secured');

Sunday, September 19, 2010

How buggy is your JPA provider ;) ? (episode 1)

My Colleague, Marcin Lewandowski has found the bug reported by me today to the Hibernate team - see: http://opensource.atlassian.com/projects/hibernate/browse/JPA-6

Very shortly - this bug bites ;) when you try to use @ManyToOne mapping with target entity using Single Table inheritance type - in this case Hibernate (3.4.0 GA) generates SQL which doesn't use the discriminator in the where clause, which in turn causes beautiful exception when processing the query results.

Both OpenJPA, and Eclipselink are generating correct SQL queries - for the details see the Hibernate bugtracker.