Sunday, September 28, 2014

SQL Query Fundamentals (Part 1)

When I was a fledgling consultant, SQL was not the ubiquitous backbone of Dynamics GP that it is today.  Back then, Great Plains ran on C-Tree and Btrieve databases engines.  Microsoft Access, connected to these databases via ODBC, was the power tool for data analysis and custom reporting.

Using Microsoft Access was fairly straight-forward.  Adding tables to a query immediately displayed the fields in the table. Position the main table on the left, and the related table(s) to the right, and drag an arrow from the common fields in one table to the other(s), and instant join. If you didn't get the data you wanted, simply right click on the connecting arrow and choose from options to connect the tables displayed in English.

Now, one of the most versatile tools in the Dynamics GP toolkit, is SQL Query Analyzer. In stark contrast to Microsoft Access, when you open SQL Query Analyzer, you are greeted with a blank screen.  This is reminiscent of the days of DOS, when you started your computer, and were rewarded with a blinking command prompt.

The safest of SQL Scripts are Select Statements.  This doesn't mean they are foolproof or without risk. If you execute a select statement on a big enough table, or link a few large tables together and throw in a WHERE clause with an AND or an OR, and you can cripple SQL performance.

Here's a sample SQL script for to illustrate some key concepts:

If you analyze the results of this query, you'll notice columns are returned side-by-side; the first iteration of each column has been transformed using various methods and the second is displayed as it normally appears without modification. 

Doc_Type in text vs. SOPTYPE as integers
Doc_Date in US Format vs. DOCDATE in SQL Date/Time format 
Line_No in Integers vs. LNITMSEQ in increments of 16,384

So let's talk fundamentals:

The first step in query design process is to select some data from a table. This is typically accomplished using a simple select statement, akin to "Select COLUMN from TABLE," or "Select * from TABLE." What is important to know is informing SQL what table(s) you desire to select data from allows SQL to help you with the process.  Once SQL knows the table(s) you're querying against, it will prompt you with valid column names, which belong to these tables.

Enter the humble asterisk, Shift+8 or *. A.K.A. "star" as in *.* or "star dot star." In the DOS days, *.* was code for show me everything.  The humble asterisk performs much the same function in SQL. 

Starting with all fields is advisable, as this will allow you to see all the columns and data in a table, which will help you build a better query.  If the table is big, you'll want to add a WHERE clause (at the end of the query) to restrict the data set, so you're not tying up system resources - see WHERE clauses below.

In the end, you won't need every column in every table you've joined, you're typically looking for very specific data, which is why you joined the tables in the first place. 

Returning specific fields requires being very specific.  To return only the columns you desire, you need to call them out by name in TABLE.COLUMN format. It's not always absolutely necessary, but becomes absolutely critical when your joined tables contain the same columns, and your're returning this data. 

Failure to spell out the complete table and column can result in errors about ambiguity. Errors about ambiguous data is SQL saying, I see you're asking for SOPNUMBE, but that column is in both tables, which one did you mean? Since joining tables requires matching columns, it is typical for the tables to have duplicate columns. You should separate columns with a comma (i.e. TABLE-A.COLUMN-A as ALIAS-1, TABLE-A.COLUMN-B as ALIAS-2)

Additionally, you're going to want to produce usable data, and not all data in SQL makes sense as it is stored.  So, lets talk transformation...

Aliases are your friend.  It is easy to ALIAS a column, which changes the display name, not the actual name.  Just type whatever name you'd like to appear as the column name (header) to the right of the specific column.  

The same method can be applied at the table level.  You can type a more understandable and/or easier to remember name to the right of the table(s) you are selecting data from. You may have noticed, I typed AS between the table name and the associated aliases. When creating an alias, the word AS is totally optional.

Cases are sometimes absolutely necessary.  A CASE evaluates the values in the a table and performs actions, like transformation based on these values.  In my example query, the CASE statement looks at the value returned in the SOPTYPE field and converts it to understandable text.  Of course you need to know what these values equate to, in order to write your case, so you'll want to do your research prior to finalizing your query.

Conversions and equations can help transform data from one type or format to another.  In this query, conversions and equations were used to change the default date format "YYYY-MM-DD HH:MM:SS.HoS" for the DOCDATE field to a common and clearer US date format "MM/DD/YYYY".  Additionally, the LNITMSEQ field is converted from increments of 16,384 to simple and understandable integers. Conversions can also be used to trim trailing spaces, select part, rather than all of a data set and other very useful operations.

Joins allow related tables to be linked, so reports can include all the required data, not just the data in the core table.  In this instance, the tables have been joined using an "Inner Join" on Sales Order Number and Sales Order Type. The SOP10100 table contains Sales Order Header information and the SOP10200 table contains Sales Order Lines.  These tables are the Dynamics GP Sales Order open tables, and do not contain historical data. 

This query makes use of an Inner Join.  Inner Joins include data in both linked tables where there are matching records in the linked fields in both tables. There are many kinds of Joins, with various uses.  The chart below is one of the best I've seen on explaining these various joins and their purpose. The Inner join is visually represented in the center of the illustration below, which shows the data set returned is only data which overlaps in the two tables. In my career, I have found a practical use for every join illustrated below.

Finally, WHERE clauses allow you to specify/restrict the data you would like displayed. Because of the WHERE clause, used in the sample query, the query only displays data for one Sales Order. WHERE clauses are also useful when defining date ranges, or identifying orders containing specific items, or entered on for specific customers, etc.

This overview should provide the fundamentals for writing queries to retrieve data from Dynamics GP.  In order to make use of this tool, you must know or identify where to find the data you're looking for.  I can think of no better source for such information than the following link.

There are plenty of table reference sites out there, but Victoria's site is well organized, and contains other useful information like the translations for statuses and document type field data, which is instrumental in writing CASE statements.

No comments:

Post a Comment