Pages

Thursday, August 18, 2011

Delete Empty Excel work sheets dynamically through SSIS

Yesterday I visited the SSIS forum and come across this requirement (delete empty excel sheets dynamically from a excel workbook). In SSIS we have to write the Script in Script Task. We've to use COM API to do this.


Follow below steps to solve this problem.

Step1: Open your BIDS and Use a Script Task  under the control flow.

Step 2: Select the Script language in which you are comfortable. Here I'm using the VB Script and than click on the Edit Script button.

Step3: In the Script Editor you need to add the "Microsoft.Office.Interop.Excel.dll" in your reference.

if you want to use this DLL, following these steps:

  • On the Project menu, click Add Reference.
  • On the COM tab, locate Microsoft Excel Object Library, and then click Select. In Visual Studio 2008, locate Microsoft Excel 12.0 Object Library on the COM tab.
  • Click OK in the Add References dialog box to accept your selections. If you are prompted to generate wrappers for the libraries that you selected, click “Yes”.



Step 4: you need to import the following reference in your code.
Imports Microsoft.Office.Interop.Excel


Step 5: Write the following code and change the appropriate path for your excel file ( you can use the SSIS variable for passing the file name)



Public Sub Main()
        Dim Wexcel As Microsoft.Office.Interop.Excel.Application = New Microsoft.Office.Interop.Excel.Application()
        Dim Wb As Workbook = Wexcel.Workbooks.Open("C:\emptyExcelSheet.xls")
        Dim wf As WorksheetFunction = Wexcel.WorksheetFunction


        For Each sheet In Wexcel.Worksheets 
            If wf.CountA(sheet.Cells) = 0 Then
                Wexcel.DisplayAlerts = False
                sheet.Delete()
                Wexcel.DisplayAlerts = True


            End If
        Next sheet
        Dts.TaskResult = ScriptResults.Success
    End Sub

Step 6: Build the code first and then close the script editor. Now you can execute your package and test the results.

Hope this helps you..


Wednesday, August 10, 2011

SSIS - Denali vs 2008 - Control Flow

Hurry !!! Denali (SQL Server 2011) is available for all Techic's now. In the last post I've explained about the installation steps for denali. After the installation, I opened the SSIS and I found a single new member(Task) in control flow where as missing below 2 tasks:



Below image is showing the SSIS Denali vs SSIS 2008 Control Flow tool list. There is NO big change in the SSIS again(after a first look) ...

               SSIS 2008                                                                         SSIS Denali



SQL SERVER :: Denali (2012) :: Step By Step Installation Guide

New era is available in the database and BI market and its name is "SQL Server Denali". Believe me, you'll definitely love to work on it. Yesterday night I installed this on my laptop and here I'm sharing my experience with you.

Installation is quite straight-forward, but you would find additional tab pages while installation. In the below screen you can see new features for selection like Data Quality Services, Data Quality Client, Distributed Replay Controller, and Distributed Replay Client.

You can download it from following link

http://www.microsoft.com/sqlserver/en/us/future-editions.aspx
http://www.softpedia.com/get/Internet/Servers/Database-Utils/Microsoft-SQL-Server.shtml
(This is free till today... Not sure if you need to pay tomorrow)



After download you will get 3 files in your folder and once you click on the installation file you will get the last file (SQLFULL_x86_ENU_Lang.box)



Once you click on application file it'll extract the Set-up file and all supporting file in the folder named "SQLFULL_x86_ENU".

You'll find the following file and folders in the "SQLFULL_x86_ENU" folder.


Important: While Installation you must be connected to internet for the updates (Optional). It'll download approximately 500MB data for installation.


Now you need to double click on Setup.exe file in this folder and get ready for some more configuration setting which is important while for the installation. This installation quit similar to SQL Server - 2008.


Now you can see the default window on your screen. It is showing the Planning tab.. as follow.
Here you can analyze your requirement and do the planning accordingly .


Next is Installation tab, where you can select the appropriate option for the installation. Here I'm selecting the first option "New SQL Server Stand-alone installation"


Now I'm just adding few Images which will come across your installation  process where you need not to do more. Simply click to next. :)

Setup Support Rules, simply check the following operation on your system. If its fail you need to rectify that and then you need to start again. After this you just go with the flow.







Product updated are optional here... if you need to install these updates you need to connect to internet.


As mention in the above screen-shot that "Setup Updates" will be installed immediately.



Setup Support Rules again check the set related requirement before starting the installation.







Here I'm creating the another instance (named) of SQL Server Denali with the "Denali" name.



Installation Wizard uses the some default services accounts for running the services. you can change these services to your name.


You can change these service Accounts to your account.. I did it (This is optional)












And SQL Server Denali is ready to server you.. Let me know if you are facing any problem during the installation.






......

Friday, March 25, 2011

Handling extra/missing columns in Delimited Flat file

Delimited Flat files (CSV) are one of the important non-relational source for SSIS ETL but if we get robust file all the time then we can use inbuild flat file source in SSIS. Surprisingly we are not that much lucky and get the data in which columns are missing (or we can say extra columns) like as follow...
Problem Reference(Click here) 



So handling such a file is very problematic for a SSIS ETL developer.Here I'm giving you some work around for such cases


  1. SSIS Delimited File Source available at Codeplex  (Click here)
  2. Script Component--- Check Sudeep's Blog (Click here)
  3. Flat file source which reads all data into one column and then splits them based on index of the comma in a derived column.
Here I'm giving you the demonstration to handle such file with the help of derived column.We have to write expression (sometime complex) to get the proper data in column format. In the above example I am getting all the data as String and after splitting we are saving it as string only (If you want to change the data type then you can use Data Conversion task). If number of columns are more then this approach is not recommended, In that case you should go with the first 2 approach.

Step1: Inside the Data Flow Task (DFT), Select a Flat File Source and Derived column.


Step2: Configure the FlatFile Source in such a way that you will get all input data in single column.


Note:  We are keeping the Column name as "C" in the example.

Step3: Double click the derived column and use the following logic.

  • For separating the column "C1" from the column "C", we need to fetch the data from position 1 to first occurrence of delimiter " " so finding the first delimiter we have to use FINDSTRING SSIS function (Returns the location of the specified occurrence of a string within a character expression).
  • SUBSTRING(C,1,FINDSTRING(C,",",1) - 1)
  • Here we have to do "-1" because we have to exclude delimiter from the column value.
  • For separating the column "C2" from the column "C", we have to fetch the data after the first delimiter to 2nd delimiter but before that we have to check whether we have 2nd delimiter in place

  • FINDSTRING(C,",",2) != 0 ? SUBSTRING(C,FINDSTRING(C,",",1) + 1,FINDSTRING(C,",",2) - FINDSTRING(C,",",1) - 1) : SUBSTRING(C,FINDSTRING(C,",",1) + 1,LEN(C) - FINDSTRING(C,",",2))  
  • For separating the column "C3" from the column "C"
  • FINDSTRING(C,",",2) == 0 ? "NULL" : (FINDSTRING(C,",",3) != 0 ? SUBSTRING(C,FINDSTRING(C,",",2) + 1,FINDSTRING(C,",",3) - FINDSTRING(C,",",2) - 1) : "NULL")

  • For separating the Last column "C4" from the column "C",we have to fetch the data after the third delimiter to end of the row. and for calculating the last position of the row, we have to use LEN SSIS function
  • FINDSTRING(C,",",3) == 0 ? "NULL" : SUBSTRING(C,FINDSTRING(C,",",3) + 1,LEN(C) - FINDSTRING(C,",",2))
Step4: And result would be as expected..


Note: Let me know your valuable comments to make this post more robust and helpful..