Tuesday, July 23, 2013

Pivoting data in SqlServer


While working on transforming data (ETL) from a Entity-attribute-value data model, there are several challenges. Once of which is to transform data between values in a column in table and a string representation (like all values concatenated in a comma separated string), which I already described in one of my blog.
Another interesting challenge is to be able to transform attribute-values to a row structure, so that the attributes become column names, and you can see all the values for the attributes in a row.
I used the pivot function in SqlServer to achieve this as described below.
You need to know the attribute names to be able to lay them across as columns, but it is easy to generate sql string using my previous blog on concatenating strings for rows of a column.


CREATE TABLE #EAV_Test (
 ID int not null,
  code VARCHAR(30) NOT NULL,
  value VARCHAR(200),
  CONSTRAINT pk_EAV_Test
  PRIMARY KEY (ID, code));

INSERT INTO #EAV_Test (ID,code, value) VALUES(1,'Risk', 'High');
INSERT INTO #EAV_Test (ID,code, value) VALUES(1,'Priority', 'Med');
INSERT INTO #EAV_Test (ID,code, value) VALUES(1,'type', 'personal');
INSERT INTO #EAV_Test (ID,code, value) VALUES(1,'date', '20080215');
INSERT INTO #EAV_Test (ID,code, value) VALUES(1,'amount', '3500.00');

INSERT INTO #EAV_Test (ID,code, value) VALUES(2,'Risk', 'Low');
INSERT INTO #EAV_Test (ID,code, value) VALUES(2,'Priority', 'High');
INSERT INTO #EAV_Test (ID,code, value) VALUES(2,'type', 'open');
INSERT INTO #EAV_Test (ID,code, value) VALUES(2,'amount', '250.00');

SELECT ID, code, value FROM #EAV_Test;


select ID,Amount,[Date],[Priority],[Risk],[Type]  from
(
select Id, Code,Value from #EAV_Test) as t
  pivot (max(value) FOR code in (Amount,[Date],[Priority],[Risk],[Type])) as pt
  order by ID;

/* Results for previous query
ID     Amount Date   Priority      Risk   Type
1      3500.00       20080215      Med    High   personal
2      250.00 NULL   High   Low    open
*/
 
drop table #EAV_Test







SQlServer Concatenating row values from a column in a table using ‘For XML’


When writing ETL jobs in SQL, transforming data between values in a column in table and a string representation (like all values concatenated in a comma separated string) can be a required step.
In my recent ETL project, while trying to de-normalize an Entity-attribute-value data model to a data warehouse friendly schema, I encountered a similar problem.
I had to concatenate the attribute names into a comma separated list, so I could generate the desired schema and issue queries using the ‘IN’ statement. I think I’d leave the details of the job and discussions or reasoning regarding my approach.
I was able to accomplish this using the‘FOR XML’ and a correlated subquery in sql server as shown below.

CREATE TABLE #EAV_Test (
 ID int not null,
  code VARCHAR(30) NOT NULL,
  value VARCHAR(200),
  CONSTRAINT pk_EAV_Test
  PRIMARY KEY (ID, code));

INSERT INTO #EAV_Test (ID,code, value) VALUES(1,'Risk', 'High');
INSERT INTO #EAV_Test (ID,code, value) VALUES(1,'Priority', 'Med');
INSERT INTO #EAV_Test (ID,code, value) VALUES(1,'type', 'personal');
INSERT INTO #EAV_Test (ID,code, value) VALUES(1,'date', '20080215');
INSERT INTO #EAV_Test (ID,code, value) VALUES(1,'amount', '3500.00');

INSERT INTO #EAV_Test (ID,code, value) VALUES(2,'Risk', 'Low');
INSERT INTO #EAV_Test (ID,code, value) VALUES(2,'Priority', 'High');
INSERT INTO #EAV_Test (ID,code, value) VALUES(2,'type', 'open');
INSERT INTO #EAV_Test (ID,code, value) VALUES(2,'amount', '250.00');

SELECT ID, code, value FROM #EAV_Test;

/* Get all distinct codes for an ID */
Select distinct ID,
stuff((select ','+cast(code as varchar(1000))
        from #EAV_Test as T2
        where T1.ID = T2.ID 
        for xml path(''), type).value('.', 'varchar(max)'),1,1,'') AllCodes
from #EAV_Test T1

/* Results for previous query
ID     AllCodes
1      amount, date, Priority, Risk, type
2      amount, Priority, Risk, type
*/

/* Get all values for each code */
Select distinct code,
stuff((select ','+cast(value as varchar(1000))
        from #EAV_Test as T2
        where T1.code = T2.code
        for xml path(''), type).value('.', 'varchar(max)'),1,1,'') AllCodes
from #EAV_Test T1

/* Results for previous query
code   AllCodes
amount 3500.00, 250.00
date   20080215
Priority      Med, High
Risk   High, Low
type   personal, open
*/

drop table #EAV_Test