Wednesday, May 25, 2011

Unpivot data transformation in SSIS

I get sales product data in an excel spreadsheet. This file holds yearly sales data of all the products. Following is the excel template business users are using to capture sales data month to month. The requirement is to unpivot the sales data and store in a table with appropriate snapshot. Find below file template with sample data -

image

Find below table structure where above file has to be imported using package.

CREATE TABLE [DBO].[TTB_PRODUCT_SALES](
[PRODUCT_NAME] NVARCHAR(100
) NOT NULL,
[SALES_AMT] DECIMAL(15,4
) NULL,
[SNAPSHOT_DATE] VARCHAR(50
) NULL
)

Next step would be to create a SSIS package to import data from excel file into the database table recursively such that all these months are loaded. Note that we can do this using unpivot transformation task.
Steps to create package - 


  • Go to your integration service project in your solution and add new SSIS package.

  • Add a data flow task under Control Flow tab as under -
    image

  • Double click on the above task to go to the data flow tab. Next, right click on Connection Managers tab and select Excel from the connection manager list, click on Add… button and select sales excel as your source. Make sure you checked First row has column names in the Excel Connection Manager.
    image

  • You can add data viewer and run the package to view current state of dataset. It will be like this  -
    image

  • Add Unpivot transformation tasks to unpivot sales columns into rows. Double click on Unpivot transformation task and editor as under -
    image

    As you see we have to pass through Product as we want to keep product and transform other columns (months like Jan, Feb etc) into rows. Meaning, month columns will be converted into rows for each product. Set Destination Column to SALES_AMT to store unpivoted value in the database table. Set Pivot key value column name to “PKeyMonth”. This variable will hold value of the pivot key value. In our example we have put month name as key (Jan, Feb etc). Add a data viewer to check incoming rows and variable value -
    image

  • Let’s add a script component task to read “PkeyValue” value and convert it into snapshot date. Select Transform in the Select Script Component Type dialog. In the Script Transformation Editor go to Inputs and Outputs tab and add a output column name SnapshotDate. Next switch to Script tab and place following script to assign snapshot date value to newly added variable. We will use this variable to set snapshot in the database table.

    public override void
    Input0_ProcessInputRow(Input0Buffer Row)
    {
    string varSnapShotDate = string.Empty;
    switch (Row.PKeyMonth)
    {
    case "Jan": varSnapShotDate = "2011-01-31"; break;
    case "Feb": varSnapShotDate = "2011-02-28"; break;
    case "Mar": varSnapShotDate = "2011-03-31"; break;
    case "Apr": varSnapShotDate = "2011-04-30"; break;
    case "May": varSnapShotDate = "2011-05-31"; break;
    case "Jun": varSnapShotDate = "2011-06-30"; break;
    case "Jul": varSnapShotDate = "2011-07-31"; break;
    case "Aug": varSnapShotDate = "2011-08-31"; break;
    case "Sep": varSnapShotDate = "2011-09-30"; break;
    case "Oct": varSnapShotDate = "2011-10-31"; break;
    case "Nov": varSnapShotDate = "2011-11-30"; break;
    case "Dec": varSnapShotDate = "2011-12-31"; break;
    }
    Row.SnapshotDate = varSnapShotDate;
    }

  • Your SSIS Package data flow tab should look like this. Run the package and query database table
    (
    SELECT * FROM [DBO].[TTB_PRODUCT_SALES]
    ).
    image

No comments:

Post a Comment