Recently I wrote about the SQL Server workloads big picture, and now I would like to dig a bit deeper.
So, what really happens when a .NET application tries to carry out a workload on SQL Server, and what consumes time during this process?
Let’s look at the following scenario: we have an application which connects to a SQL Server instance, requests some data and consumes it.
It is a very simple scenario, however there are many steps that happen before the data is finally consumed by the application.
First of all, we need a connection string to give to the application. Then:
- a physical channel needs to be established (socket or named pipe),
- a handshake with the server needs to be performed,
- the connection string needs to be parsed,
- the request must be authenticated, and then the transaction must be formed and passed to the server.
So, let’s call this Connection Latency. In other words, Connection Latency is the time between ‘pressing the button’ which fires the request and the moment the connection is established.
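Since this is a .NET scenario, a real measurement would wrap something like `SqlConnection.Open()` in a stopwatch; here is a minimal, language-neutral Python sketch of the same idea, with a hypothetical stand-in `open_connection` function in place of a real driver call (no actual server is assumed):

```python
import time

def open_connection(connection_string):
    # Stand-in for a real driver call; here we simply simulate the cost of
    # the channel setup, handshake, parsing and authentication steps.
    time.sleep(0.005)
    return object()

start = time.perf_counter()
conn = open_connection("Server=myhost;Database=mydb;Trusted_Connection=True;")
connection_latency_ms = (time.perf_counter() - start) * 1000.0
print(f"Connection latency: {connection_latency_ms:.1f} ms")
```

The same pattern (timestamp before, timestamp after, subtract) is how all three latencies discussed below can be measured from the client side.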
Of course, because of the ADO.NET connection pooling functionality, there will be some difference between runs, depending on whether a connection has already been established before with this particular connection string or not.
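The pooling effect can be illustrated with a toy sketch: a pool keyed by connection string (as ADO.NET pools are), where only the first request for a given string pays the full establishment cost. This is a simulation, not the real pooling implementation:

```python
import time

_pool = {}  # pooled connections, keyed by connection string

def get_connection(connection_string):
    # Only the first request for this connection string pays the
    # establishment cost; later requests reuse the pooled connection.
    if connection_string not in _pool:
        time.sleep(0.01)  # simulate socket + handshake + authentication
        _pool[connection_string] = object()
    return _pool[connection_string]

cs = "Server=myhost;Database=mydb;"

t0 = time.perf_counter()
get_connection(cs)
cold_ms = (time.perf_counter() - t0) * 1000.0

t1 = time.perf_counter()
get_connection(cs)
warm_ms = (time.perf_counter() - t1) * 1000.0

print(f"cold: {cold_ms:.1f} ms, warm: {warm_ms:.3f} ms")
```

The first (‘cold’) call is orders of magnitude slower than the pooled (‘warm’) one, which is exactly the run-to-run difference mentioned above.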
The latencies here depend strongly on the network connectivity, the bandwidth, and whether the server is local to the requester or remote (i.e. whether the application is on the same physical host as SQL Server, or the application is local and connecting to SQL Azure).
Requesting the data…
The next step, after establishing the connection is to send the request to SQL Server.
This step is a bit more complex than the connection and includes:
- converting the text of the query into Tabular Data Stream (TDS) packet(s) and submitting them over the network to the server,
- receiving the TDS packets on the server,
- verifying that they are error-free,
- assembling the packets (if the text of the query request is larger than the size of the packet defined by the network settings, then the packets will need to be transported in several ‘chunks’),
- then, after the packets are assembled, they have to be parsed, algebrized and optimized,
- then the request is carried out by the Relational Engine; data is requested from the Storage Engine, cached in memory, and eventually packed into TDS packets again, ready to be sent back to the requester.
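The chunking step above can be sketched as follows: a query text is split into packet-sized pieces for transport and reassembled on the other side. The packet size here is an assumption for illustration (the actual TDS packet size is negotiated and configurable), and the helper names are hypothetical:

```python
PACKET_SIZE = 4096  # assumed packet size for illustration

def to_packets(query_text, packet_size=PACKET_SIZE):
    # Split the encoded query text into packet-sized chunks for transport.
    data = query_text.encode("utf-16-le")
    return [data[i:i + packet_size] for i in range(0, len(data), packet_size)]

def assemble(packets):
    # Reassemble the chunks back into the original query text.
    return b"".join(packets).decode("utf-16-le")

query = "SELECT * FROM Sales.Orders WHERE OrderDate >= '2011-01-01' " * 200
packets = to_packets(query)
print(f"{len(query.encode('utf-16-le'))} bytes split into {len(packets)} packets")
```

A query larger than one packet travels in several chunks, which the server must receive, verify and reassemble before parsing can even begin.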
This entire time between receiving the TDS packets, optimizing and gathering data up to the point when the first TDS packet is sent back to the application can be called Query Processing Latency.
Receiving the data…
The next step is for the application to receive all the data in the form of TDS packets and to assemble them. Again, this process depends on the network settings and the topology of the server setup.
The entire time from the sending of the first TDS packet back to the application to the assembling of the last TDS packet can be called Data Receiving Latency.
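From the client side, this latency is just the time spent draining the result stream. A small sketch, with a hypothetical `stream_packets` generator standing in for reading TDS result packets off the network:

```python
import time

def stream_packets():
    # Stand-in for reading TDS result packets off the network,
    # with a simulated per-packet network delay.
    for i in range(5):
        time.sleep(0.002)
        yield f"packet-{i}|".encode()

start = time.perf_counter()
buffer = bytearray()
for packet in stream_packets():
    buffer.extend(packet)  # assemble the packets as they arrive
data_receiving_latency_ms = (time.perf_counter() - start) * 1000.0
print(f"Data receiving latency: {data_receiving_latency_ms:.1f} ms "
      f"for {len(buffer)} bytes")
```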
The sum of all latencies…
So, all in all, we have a Total Latency, which is composed of the Connection Latency + the Query Processing Latency + the Data Receiving Latency.
For the sake of precision, they are measured in milliseconds.
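As a worked example, with hypothetical measured values:

```python
# Hypothetical measured values, in milliseconds.
connection_latency_ms = 12.0
query_processing_latency_ms = 48.0
data_receiving_latency_ms = 7.0

total_latency_ms = (connection_latency_ms
                    + query_processing_latency_ms
                    + data_receiving_latency_ms)
print(f"Total latency: {total_latency_ms} ms")  # Total latency: 67.0 ms
```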
Now that I have explained the basics, let’s have some fun with this…
Stay tuned for some tests.