Range Lookups in the SSIS Lookup Transformation
Let’s say, in an SSIS Data Flow which is processing invoices, you wanted to match an invoice to an event based on the date of the invoice falling between the start and end date of the event. To achieve this in a SSIS Lookup Transformation, you need to use advanced configuration.
The configuration steps required follow:
- On the General page, select Partial Cache. This enables the ‘Modify the SQL Statement’ box on the Advanced Tab.
- On the Connection page, select the ‘lookup’ table as per normal, or enter a Query without a Where clause that selects only the columns you need. This step sets the Column Meta Data of the Lookup.
- On the advanced tab enter your custom query in the ‘Modify the SQL Statement’ box . See the example below for more information.
- Map input columns to the parameters, using the Parameters button on the advanced tab. See notes for gotchas.
Date Range Lookup Example
In the example described above, the Input to the Lookup Transform contains one row per invoice. The Event has a start date and and end date. Your reference query in the ‘Modify the SQL Statement’ box would be: SELECT * FROM [Warehouse].[dbo].[Events] WHERE ? BETWEEN [Event_From_Date] and [Event_To_Date]. The ? would be mapped to the InvoiceDate input Column. When the lookup executes, it executes the query for each Input row, substituting in the invoiceDate for the ? This returns only events that were happening at the date of the invoice. If a result set has been returned, a match has occurred, and the information for the event can then be included in the data flow. Hopefully there would only be a single event underway at the time, but if there were multiple concurrent events at the invoice date, multiple rows are returned. In this case, the last row is used by the Lookup Transform.
Tips
- In order to make Input Columns available for mapping to query parameters, it is necessary to map them between the ‘Available Input Columns’ and ‘Available Lookup Columns’ on the Columns page first. Otherwise they don’t appear in the list of available input columns in the Advanced page – Parameters dialog. In the case above I needed to map Invoice Date to Event_From_Date. This seems counter intuitive, because this is saying, only match if Invoice Date = Event_From_Date. Fortunately, in my testing, I found that this matching criteria is simply overridden by the custom query.
- The custom query must return the same rows as the table/view or Query you specified on the Connection page. Otherwise a Meta Data error is produced.
This is ingenuious. This is teriffic. Thank you very much.
thanks 🙂
Great post! Tip #1 is just what I needed. Thanks for sharing.
Doesn’t partial cache make the query result comparison case sensitive?
Many, many thanks. Info still current as of 2019. I’d never have figured out the bit about the input column parameter having to be part of, and mapped to, the lookup query.
Thank you !!