Migrating Time Series Data to SQL Server from Yahoo Finance and Stooq.com

By:   |   Updated: 2022-02-24   |   Comments   |   Related: More > TSQL


Problem

Please demonstrate how to import to SQL Server price and volume time series data for financial securities from Yahoo Finance and Stooq.com. Illustrate the user interfaces from each data vendor for configuring csv files with downloaded data. Highlight differences in the file formats between vendors and their impacts on how to import csv files to SQL Server. Evaluate the similarity of time series data for a handful of ticker symbols from the two financial data vendors.

Solution

There are multiple data vendors that provide free downloading of historical time series price and volume data for ticker symbols traded in various markets around the world. MSSQLTips.com recently demonstrated one approach to this task with Stooq.com. This tip illustrates a different approach that may be better suited for gathering data on a relatively small number of ticker symbols. Another highly regarded stock market data vendor is Yahoo Finance. You should be cognizant that vendors provide different ways of displaying, downloading, and charting time series data. Furthermore, these ways can and do change over time. Therefore, this tip offers a snapshot of an evolving series of ways of downloading stock market data.

It is good to know about multiple ways to acquire time series financial securities data because free data availability is in a constant state of flux. It is unwise to base your work on data from a single vendor. For example, Google within the past five years or so provided free time series financial data downloads, but the firm discontinued this service. Yahoo Finance used to make its financial securities data available through a programmatic interface to Python developers. However, Yahoo Finance ceased supporting this option for getting financial time series downloads. Happily, we can still download financial securities from Yahoo Finance via a manual user interface. In the years that I have been tracking financial data available for download without a charge, Stooq.com is gaining notoriety.

This tip demonstrates user interfaces from two vendors. Their user interfaces offer a no-code solution to acquiring time series data for ticker symbols in csv files. The interfaces allow you to configure a separate csv file for each ticker symbol.

You will also learn how simple it is with the T-SQL bulk insert statement to import and concatenate the csv files for multiple ticker symbols into a single SQL Server table. Because the csv file formats and naming conventions are inconsistent across the two vendors, slightly different specifications for the statement are required for the csv files from each vendor.

After the time series data are loaded into SQL Server, the tip concludes with a top level comparison of the downloaded data from each vendor. This comparison will acquaint you differences between the two data sources. It will also reinforce your skills for performing basic time series analysis in SQL Server.

User interfaces for downloading time series data from Stooq.com and Yahoo Finance

The aim of this section is to illustrate the basics of downloading a csv file with historical price and volume data for a ticker symbol from both Stooq.com and Yahoo Finance. A separate subsection is devoted to the user interface from each vendor.

Using the Stooq.com user interface

The aim of this subsection is to present a series of screens that walk you through the Stooq.com process for downloading a csv file with historical price and volume data for a ticker symbol. This process yields one csv file per ticker symbol. This subsection highlights the approach for downloading data for ticker symbols from US stock markets, but the Stooq.com site supports data from other countries as well.

You can commence the process of downloading a csv file with historical data for a ticker symbol by navigating to Stooq.com and entering a ticker symbol followed with a period and its country’s abbreviation. For example, to access historical data for the Ford Motor Company ticker in the US, enter F.US in the Symbol box on the home page. Clicking the F.US symbol in a drop-down list navigates you to another page with a chart of selected data on the right side of the screen and some menu selections on the left side of the screen.

stoog stock data

Here is an excerpt from the screen to which clicking the F.US ticker symbol navigates you. As you can see, the cursor rests on the Historical Data menu item.

stoog stock data

The next screen excerpt shows selections for configuring the historical data that displays.

  • The default interval selected is Daily. Therefore, price and volume display for the end of each trading date in the date range from the Start date through the End date.
  • Stooq.com automatically populates, respectively, the start and end date fields
    • The start date is set to the start of available data for the designated ticker symbol.
    • The end date is set to the most recently available date with a close price for the trading day.
  • You can edit the start and end fields according to your historical data requirements. For example, the displayed historical data can begin at a date after the first date for which data is available at Stooq.com.
  • The screen shot below shows a selection of historical data from the earliest date through the most recent date with end-of-day prices and volume of shares traded for the Ford Motor Company. The screen shot was taken on January 22, 2022, which is a Saturday when US stock markets are closed. Therefore, the most recent date for which data is available is Friday January 21, 2022.
stoog download stock data

Click the double greater than sign (double arrows) to display rows of data towards the end date. The following screen shot shows data rows towards the end date followed by a text menu item. Clicking the menu item saves a csv file in the Windows Downloads folder for the Windows user account that you are using.

stoog download stock data

The next pair of screen shots shows the first and last five rows in the NotePad++ application of the csv file generated by clicking the menu item in the preceding screen shot.

  • The file is automatically named f_us_d.csv because it contains time series data for the Ford Motor Company ticker listed on a US stock exchange (F.US). Additionally, the data are daily.
  • Lines terminate with a carriage return (CR) followed by a linefeed (LF)
  • Each row has six data items
    • A date value
    • Four price values named: Open, High, Low, and Close for a security on a date
    • An integer value for the numeric volume of shares traded on a date
raw stock data
raw stock data

Using the Yahoo Finance user interface

This subsection shows a set of Yahoo Finance screen shots that illustrate the steps for generating a downloaded csv file with price and volume data over the same range of dates as in the preceding subsection for Stooq.com.

Start by navigating your browser to the Yahoo Finance home page. Then, enter the Ford Motor Company ticker symbol (F) into the search box at the top of the page as indicated in the following screen shot. Select the ticker for which you seek historical data from the drop-down list box.

yahoo finance stock data

This action leads you to the home page for the security on the Yahoo Finance website. This page appears below. The text "Ford Motor Company (F)" is highlighted for those who may not be familiar with the layout of content at Yahoo Finance. By clicking the Historical Data tab, you can move on to a screen that allows you to configure and view the historical data that you want in a csv file.

yahoo finance stock data

Next, click the Time Period link within the Historical Data folder. Use the configuration box that opens to set start and end dates for the data to be downloaded. Set the start and end dates to the first trading date after the date for which you seek data. If you set configuration box for a trading date for which you seek data, that trading date will be excluded from the downloaded csv file (during normal trading hours). In contrast to Yahoo Finance, the Stooq.com user interface allows you to simply set the start and end dates to the dates that you want in your downloaded csv file.

The default frequency for historical data is daily with Yahoo Finance, but you can click the Frequency link to change the setting to weekly or monthly. With the Stooq.com user interface, your frequency options grow from three to five so that you can get data downloaded on a daily, weekly, monthly, quarterly, or even yearly basis.

yahoo finance stock data

After making your configuration settings for the downloaded data, choose Apply to display the data that will be downloaded. The following screen shot displays the three most recent data rows from the historical dataset. Notice that there are two close price columns that appear as: Close* and Adj Close**; the asterisks point to footnotes at the bottom of the page. Traders, who have a shorter horizon often use the Close column, and investors, who have a longer horizon, frequently prefer to analyze stock performance with the Adj Close column values. The most important point about this screen in the context of this tip is that downloaded historical data from Yahoo Finance includes a column (Adj Close) that is not included in the downloaded historical data from Stooq.com.

yahoo finance stock data

Clicking the Download link transfers a copy of the configured data as a csv file to the Windows Downloads folder for the Windows account from which you are running your browser session. The filename in this case is F.csv.

yahoo finance stock data

The next pair of screen shots display the first and last five data rows from Yahoo Finance for the Ford Motor Company. There are three main distinctions between the following csv file and the one generated by Stooq.com in the preceding subsection.

  • The file names are different. Both filenames adhere to default naming conventions from their respective data vendors.
    • The filename from Yahoo Finance is F.csv.
    • The filename for Stooq.com is f_us_d.csv.
  • There is an extra column in the csv file from Yahoo Finance relative to the csv file from Stooq.com. The extra column has the name Adj Close.
  • Finally, the line terminators in the csv files are different between the two vendors.
    • The line terminator from Yahoo Finance is a linefeed (LF).
    • The line terminator from Stooq.com is a carriage return followed by a linefeed (CRLF).

All three of these distinctions impact the T-SQL code for importing the csv file data from Stooq.com and Yahoo Finance, respectively.

raw stock data
raw stock data

T-SQL code for importing csv files for multiple tickers from Stooq.com

The T-SQL code for migrating data from Stooq-generated csv files to SQL Server relies on three tables.

  • #temp_ohlcv_values is a temp table for holding csv file values
    • Data are migrated from the csv file for each ticker symbol to the temp table
    • The temp table passes its contents to a regular table that has columns for each column in the csv file plus one additional column for the ticker symbol
    • This tip collected data for 12 ticker symbols
  • stooq_ohlcv_values_with_symbol is the name of the regular table that permanently holds the time series data for all symbols with downloaded data. This table has new rows inserted into it for each ticker symbol.
  • table_of_symbols stores the ticker symbols that are processed. There is one row for each ticker symbol used in this tip. There are two columns in this table
    • One column contains numeric identifiers for each symbol used in a run of the script. This column also facilities iterating through the rows of the table with a while loop.
    • The other column contains a string value for each ticker symbol.

The following script excerpt shows the code for creating all three tables and for populating the table_of_symbols table.

  • The script starts with a use statement to reference a default database (named DataScience in this tip).
  • Three drop table if exits statements remove any prior versions of the three tables in the dbo schema of the default database or the temp database.
  • A create table statement for the #temp_ohlcv_values table creates a fresh version of the temp table. Because this table is to be populated by the csv file downloaded by Stooq.com, the columns of the table match those in the csv file.
  • Another create table statement (for the stooq_ohlcv_values_with_symbol table) gets iteratively populated by the #temp_ohlcv_values table. The role of the stooq_ohlcv_values_with_symbol table is to store data rows for all the symbols in the table_of_symbols table.
  • A third create table statement creates a fresh version of the table_of_symbols table. A trailing insert statement populates the table with the twelve symbols used in this tip.
  • The script excerpt ends with a select statement that displays the rows in the table_of_symbols table. By comparing the symbols from this table with the distinct symbols in the stooq_ohlcv_values_with_symbol table, a user can confirm that all symbols contributed time series data to the stooq_ohlcv_values_with_symbol table.
use DataScience
go
 
drop table if exists dbo.table_of_symbols;
drop table if exists #temp_ohlcv_values;
drop table if exists dbo.stooq_ohlcv_values_with_symbol;
 
create table #temp_ohlcv_values(
 date date
,[open] money
,high money
,low money
,[close] money
,volume bigint
)
go
 
create table dbo.stooq_ohlcv_values_with_symbol(
 symbol nvarchar(10)
,date date
,[open] money
,high money
,low money
,[close] money
,volume bigint
)
go
 
create table table_of_symbols(
 symbol_number int
,symbol nvarchar(10)
)
go
 
-- populate table_of_symbols
insert into dbo.table_of_symbols
values  (1,  'DIA')
   ,(2,  'F')
   ,(3,  'GM')
   ,(4,  'MARA')
   ,(5,  'QQQ')
   ,(6,  'RIOT')
   ,(7,  'SOXX')
   ,(8,  'SPXL')
   ,(9,  'SPY')
   ,(10, 'TQQQ')
   ,(11, 'TSLA')
   ,(12, 'UDOW') 
 
-- display rows in dbo.table_of_symbols
select * from dbo.table_of_symbols

The next script excerpt is for populating the stooq_ohlcv_values_with_symbol table with the rows from the csv file created by the manual steps described in the "Using the Stooq.com user interface" subsection of the preceding section.

  • The script starts with a declare statement for three local variables that facilitate navigation through a while loop
    • The @symbol_number variable facilitates the selection of a row by the symbol_number column value from the table_of_symbols table.
    • The @symbol local variable matches the symbol value for a row from the table_of_symbols table.
    • The @max_symbol_number sets a limit for the number of passes through the while loop. The while loop continues for as long as @symbol_number is less than or equal to @max_symbol_number.
  • Within the while loop, there is an if…else statement with twelve branches – one for each of the twelve symbols used in this tip.
    • Each branch migrates the rows from the csv file for a symbol to the stooq_ohlcv_values_with_symbol table.
    • All branches have the same structure, but selected string values, such as the string in the condition for entering each branch and the path and filename constant for a csv file, eliminate the need to use dynamic sql code and enhances security for the SQL Server from attacks by bad actors.
    • The statements within each branch are bound by a begin…end block statement. The roles of the statements within each block are to
      • perform a bulk insert from a csv file to the #temp_ohlcv_values table
      • transfer the contents of the #temp_ohlcv_values table while adding a new column that designates the ticker symbol for each set of rows from a csv file
  • After the if…else statement branches, cleanup statements are invoked for the next pass through the loop.
    • The rows of the #temp_ohlcv_values are truncated.
    • The value of the @symbol_number local variable is incremented by 1.
    • The value of the @symbol local variable is updated for the new value of the @symbol_number local variable.
  • At the conclusion of the while loop
    • A select distinct statement enumerates the distinct symbols in the stooq_ohlcv_values_with_symbol table. The symbols from the list can be compared to those in the select statement just before the declare statement at the start of this excerpt.
    • An optional select statement, which is commented, is available to display all the rows in the stooq_ohlcv_values_with_symbol table.
-- declare @symbol for current symbol and related local variables
declare 
 @symbol_number int = 1
,@symbol nvarchar(10) = (select symbol from dbo.table_of_symbols where symbol_number = 1)
,@max_symbol_number int = (select max(symbol_number) from dbo.table_of_symbols)
 
---------------------------------------------------------------------------------
 
-- loop through the rows of the table_of_symbols table
while @symbol_number <= @max_symbol_number
begin
 
-- import ticker symbol rows from a csv file into #temp_ohlcv_values
-- before inserting the temp table into the stooq_ohlcv_values_with_symbol table
-- use the if and else clauses inside the while loop to control which csv file
-- populates a set of rows on each pass through the loop.
if 'DIA' = @symbol
begin
 bulk insert #temp_ohlcv_values
 from 'C:\DataScienceSamples\Stooq_vs_YahooFinance_for_12_tickers\12_from_stooq\dia_us_d.csv'
 with( FORMAT='CSV' , FIRSTROW=2);
-- insert #temp_ohlcv_values into stooq_ohlcv_values_with_symbol
-- and populate symbol from @symbol
 insert dbo.stooq_ohlcv_values_with_symbol
 select @symbol symbol, *
 from
 (
  select [date]
        ,[open]
        ,[high]
        ,[low]
        ,[close]
        ,[volume]
  from #temp_ohlcv_values
 ) for_stooq_ohlcv_values_with_symbol
end
else if 'F' = @symbol
begin
 bulk insert #temp_ohlcv_values
 from 'C:\DataScienceSamples\Stooq_vs_YahooFinance_for_12_tickers\12_from_stooq\f_us_d.csv'
 with( FORMAT='CSV' , FIRSTROW=2);
-- insert #temp_ohlcv_values into stooq_ohlcv_values_with_symbol
-- and populate symbol from @symbol
 insert dbo.stooq_ohlcv_values_with_symbol
 select @symbol symbol, *
 from
 (
  select [date]
        ,[open]
        ,[high]
        ,[low]
        ,[close]
        ,[volume]
  from #temp_ohlcv_values
 ) for_stooq_ohlcv_values_with_symbol
end
else if 'GM' = @symbol
begin
 bulk insert #temp_ohlcv_values
 from 'C:\DataScienceSamples\Stooq_vs_YahooFinance_for_12_tickers\12_from_stooq\gm_us_d.csv'
 with( FORMAT='CSV' , FIRSTROW=2);
-- insert #temp_ohlcv_values into stooq_ohlcv_values_with_symbol
-- and populate symbol from @symbol
 insert dbo.stooq_ohlcv_values_with_symbol
 select @symbol symbol, *
 from
 (
  select [date]
        ,[open]
        ,[high]
        ,[low]
        ,[close]
        ,[volume]
  from #temp_ohlcv_values
 ) for_stooq_ohlcv_values_with_symbol
end
else if 'MARA' = @symbol
begin
 bulk insert #temp_ohlcv_values
 from 'C:\DataScienceSamples\Stooq_vs_YahooFinance_for_12_tickers\12_from_stooq\mara_us_d.csv'
 with( FORMAT='CSV' , FIRSTROW=2);
-- insert #temp_ohlcv_values into stooq_ohlcv_values_with_symbol
-- and populate symbol from @symbol
 insert dbo.stooq_ohlcv_values_with_symbol
 select @symbol symbol, *
 from
 (
  select [date]
        ,[open]
        ,[high]
        ,[low]
        ,[close]
        ,[volume]
  from #temp_ohlcv_values
 ) for_stooq_ohlcv_values_with_symbol
end
else if 'QQQ' = @symbol
begin
 bulk insert #temp_ohlcv_values
 from 'C:\DataScienceSamples\Stooq_vs_YahooFinance_for_12_tickers\12_from_stooq\qqq_us_d.csv'
 with( FORMAT='CSV' , FIRSTROW=2);
-- insert #temp_ohlcv_values into stooq_ohlcv_values_with_symbol
-- and populate symbol from @symbol
 insert dbo.stooq_ohlcv_values_with_symbol
 select @symbol symbol, *
 from
 (
  select [date]
        ,[open]
        ,[high]
        ,[low]
        ,[close]
        ,[volume]
  from #temp_ohlcv_values
 ) for_stooq_ohlcv_values_with_symbol
end
else if 'RIOT' = @symbol
begin
 bulk insert #temp_ohlcv_values
 from 'C:\DataScienceSamples\Stooq_vs_YahooFinance_for_12_tickers\12_from_stooq\riot_us_d.csv'
 with( FORMAT='CSV' , FIRSTROW=2);
-- insert #temp_ohlcv_values into stooq_ohlcv_values_with_symbol
-- and populate symbol from @symbol
 insert dbo.stooq_ohlcv_values_with_symbol
 select @symbol symbol, *
 from
 (
  select [date]
        ,[open]
        ,[high]
        ,[low]
        ,[close]
        ,[volume]
  from #temp_ohlcv_values
 ) for_stooq_ohlcv_values_with_symbol
end
else if 'SOXX' = @symbol
begin
 bulk insert #temp_ohlcv_values
 from 'C:\DataScienceSamples\Stooq_vs_YahooFinance_for_12_tickers\12_from_stooq\soxx_us_d.csv'
 with( FORMAT='CSV' , FIRSTROW=2);
-- insert #temp_ohlcv_values into stooq_ohlcv_values_with_symbol
-- and populate symbol from @symbol
 insert dbo.stooq_ohlcv_values_with_symbol
 select @symbol symbol, *
 from
 (
  select [date]
        ,[open]
        ,[high]
        ,[low]
        ,[close]
        ,[volume]
  from #temp_ohlcv_values
 ) for_stooq_ohlcv_values_with_symbol
end
else if 'SPXL' = @symbol
begin
 bulk insert #temp_ohlcv_values
 from 'C:\DataScienceSamples\Stooq_vs_YahooFinance_for_12_tickers\12_from_stooq\spxl_us_d.csv'
 with( FORMAT='CSV' , FIRSTROW=2);
-- insert #temp_ohlcv_values into stooq_ohlcv_values_with_symbol
-- and populate symbol from @symbol
 insert dbo.stooq_ohlcv_values_with_symbol
 select @symbol symbol, *
 from
 (
  select [date]
        ,[open]
        ,[high]
        ,[low]
        ,[close]
        ,[volume]
  from #temp_ohlcv_values
 ) for_stooq_ohlcv_values_with_symbol
end
else if 'SPY' = @symbol
begin
 bulk insert #temp_ohlcv_values
 from 'C:\DataScienceSamples\Stooq_vs_YahooFinance_for_12_tickers\12_from_stooq\spy_us_d.csv'
 with( FORMAT='CSV' , FIRSTROW=2);
-- insert #temp_ohlcv_values into stooq_ohlcv_values_with_symbol
-- and populate symbol from @symbol
 insert dbo.stooq_ohlcv_values_with_symbol
 select @symbol symbol, *
 from
 (
  select [date]
        ,[open]
        ,[high]
        ,[low]
        ,[close]
        ,[volume]
  from #temp_ohlcv_values
 ) for_stooq_ohlcv_values_with_symbol
end
else if 'TQQQ' = @symbol
begin
 bulk insert #temp_ohlcv_values
 from 'C:\DataScienceSamples\Stooq_vs_YahooFinance_for_12_tickers\12_from_stooq\tqqq_us_d.csv'
 with( FORMAT='CSV' , FIRSTROW=2);
-- insert #temp_ohlcv_values into stooq_ohlcv_values_with_symbol
-- and populate symbol from @symbol
 insert dbo.stooq_ohlcv_values_with_symbol
 select @symbol symbol, *
 from
 (
  select [date]
        ,[open]
        ,[high]
        ,[low]
        ,[close]
        ,[volume]
  from #temp_ohlcv_values
 ) for_stooq_ohlcv_values_with_symbol
end
else if 'TSLA' = @symbol
begin
 bulk insert #temp_ohlcv_values
 from 'C:\DataScienceSamples\Stooq_vs_YahooFinance_for_12_tickers\12_from_stooq\tsla_us_d.csv'
 with( FORMAT='CSV' , FIRSTROW=2);
-- insert #temp_ohlcv_values into stooq_ohlcv_values_with_symbol
-- and populate symbol from @symbol
 insert dbo.stooq_ohlcv_values_with_symbol
 select @symbol symbol, *
 from
 (
  select [date]
        ,[open]
        ,[high]
        ,[low]
        ,[close]
        ,[volume]
  from #temp_ohlcv_values
 ) for_stooq_ohlcv_values_with_symbol
end
else if 'UDOW' = @symbol
begin
 bulk insert #temp_ohlcv_values
 from 'C:\DataScienceSamples\Stooq_vs_YahooFinance_for_12_tickers\12_from_stooq\udow_us_d.csv'
 with( FORMAT='CSV' , FIRSTROW=2);
-- insert #temp_ohlcv_values into stooq_ohlcv_values_with_symbol
-- and populate symbol from @symbol
 insert dbo.stooq_ohlcv_values_with_symbol
 select @symbol symbol, *
 from
 (
  select [date]
        ,[open]
        ,[high]
        ,[low]
        ,[close]
        ,[volume]
  from #temp_ohlcv_values
 ) for_stooq_ohlcv_values_with_symbol
end
 
-- truncate #temp_ohlcv_values
truncate table #temp_ohlcv_values
 
-- setup for next pass through while loop
set @symbol_number =  @symbol_number + 1
select @symbol = (select symbol 
from dbo.table_of_symbols 
where symbol_number = @symbol_number)
 
end
 
 
-- display distinct symbols in stooq_ohlcv_values_with_symbol
select distinct symbol from dbo.stooq_ohlcv_values_with_symbol order by symbol
 
-- optionally display dbo.stooq_ohlcv_values_with_symbol
--select * from dbo.stooq_ohlcv_values_with_symbol order by symbol, date

T-SQL code for importing csv files for multiple tickers from Yahoo Finance

The code for importing csv files with time series data from Yahoo Finance into a SQL Server table has the same general design as the example in the previous section for importing csv files with time series from Stooq.com into a SQL Server table. However, there are three especially critical differences. This tip section includes key excerpts that highlight differences for importing the csv files generated by Yahoo Finance instead of Stooq.com. The complete T-SQL listing for importing the time series data from Yahoo Finance csv files is available in the download for this tip.

The csv files from Yahoo Finance includes one additional column that is missing from the csv files output by Stooq.com. This additional column has a name of adj close. Recall that this column has the same close column value reported by Stooq.com, but it is also adjusted by a dividend if there is one distributed on that date. The adj close is the first of three major changes.

Although the code in this section does not permanently save the adj close column values output from Yahoo Finance, the T-SQL bulk insert statement must account for the adj close column when it is copying values from the Yahoo Finance csv files to the #temp_ohlcv_values table. This is because the bulk insert statement imports the whole file instead of just a subset of its columns. Here is an excerpt that shows the code for creating a fresh version of the #temp_ohlcv_values table for receiving csv file data downloaded by Yahoo Finance.

drop table if exists #temp_ohlcv_values;
 
create table #temp_ohlcv_values(
 date date
,[open] money
,high money
,low money
,[close] money
,[adj close] money
,volume bigint
)
go 

Here is the equivalent T-SQL code for creating a fresh version of the #temp_ohlcv_values table for receiving data from the Stooq.com csv files to the #temp_ohlcv_values table. If you examine the two script excerpts, you can plainly see that the only difference is an extra column named adj close in the script for receiving the contents of a csv file from Yahoo Finance.

drop table if exists #temp_ohlcv_values;
 
create table #temp_ohlcv_values(
 date date
,[open] money
,high money
,low money
,[close] money
,volume bigint
)
go

The next script and the comments about it highlight the differences between importing time series data in a csv file from Stooq.com versus Yahoo Finance. The code focuses on the processing of the #temp_ohlcv_values table and its transfer to a SQL Server table.

Recall from the preceding section that processing the #temp_ohlcv_values table and transferring its contents to a SQL Server table is essentially the same for each of the 12 ticker symbols in this tip. Therefore, if you understand how to process the #temp_ohlcv_values table and transfer its contents to a SQL Server table for one ticker symbol, you will understand the process for all ticker symbols. The following code excerpt is for a single branch in the if…else statement that transfers the contents of the #temp_ohlcv_values table to a SQL Server table. The single branch is for a single ticker symbol – namely, DIA, which is an exchange traded fund that tracks the Dow Industrial Average.

The 12 branches of the if…else statement are embedded in a while loop that progressively moves through the ticker symbols in the table_of_symbols table. A declare statement before the while loop and T-SQL code towards the end of the while loop updates the value of a local variable named @symbol. When the value of @symbol equals DIA, then it enters the branch that shows below. The code inside the begin…end block is for reading the contents of the csv file downloaded from Yahoo Finance.

  • Yahoo Finance automatically names the file (dia.csv) after the ticker symbol being processed. When the value of @symbol changes to a different ticker symbol, the downloaded filename will change in a corresponding way.
  • Yahoo Finance also copies the file to the Downloads folder of the current user initiating the download operation.
  • The downloaded file is subsequently moved to the C:\DataScienceSamples\Stooq_vs_YahooFinance_for_12_tickers\ 12_from_YahooFinance\ path. You should move the downloaded file to wherever your T-SQL code expects to find it.
  • The bulk insert statement transfers the contents of the dia.csv file to the #temp_ohlcv_values table. The new destination path name and new filename is the second major change for transferring csv files from Yahoo Finance to a SQL Server table.
  • The with clause for the bulk insert statement has different contents than in the example for transferring csv file contents from Stooq.com.
    • The code below is for a csv file that terminates each row with a linefeed. This is the way Yahoo Finance terminates rows in the csv files it downloads.
    • The comparable code in the preceding section is for transferring contents from csv files that are downloaded from Stooq.com. The Stooq.com website exports csv files that terminate each row with a carriage return and a trailing linefeed.
    • Sample output from each time series vendor appears in the "User interfaces for downloading time series data from Stooq.com and Yahoo Finance" section.
  • After the bulk insert statement transfers the contents of a csv file for a ticker symbol to the #temp_ohlcv_values table, an insert…select statement adds the rows to the yahoo_finance_ohlcv_values_with_symbol table. In the process of adding the rows from the contents of the #temp_ohlcv_values table, the code also populates the symbol column in the yahoo_finance_ohlcv_values_with_symbol table with the current value of @symbol. The rows added from the #temp_ohlcv_values table exclude the adj close column. The dropping of the adj close column is the third major change for transferring csv files from Yahoo Finance to a SQL Server table.

With the exception of the code for the 12 if…else branches, the remainder of the two codes is essentially the same or does not require any special commentary.

if 'DIA' = @symbol
begin
 bulk insert #temp_ohlcv_values
 from 'C:\DataScienceSamples\Stooq_vs_YahooFinance_for_12_tickers\12_from_YahooFinance\dia.csv'
 with (
    FIELDTERMINATOR = ',', 
    FIRSTROW = 2,
    ROWTERMINATOR = '0x0a'  
 );
-- insert #temp_ohlcv_values into yahoo_finance_ohlcv_values_with_symbol
-- and populate symbol from @symbol
 insert dbo.yahoo_finance_ohlcv_values_with_symbol
 select @symbol symbol, *
 from
 (
  select [date]
        ,[open]
        ,[high]
        ,[low]
        ,[close]
        ,[volume]
  from #temp_ohlcv_values
 ) for_stooq_ohlcv_values_with_symbol
end

Selected comparisons of stooq.com and Yahoo Finance time series data

The data collection phase of this tip deposited data for 12 ticker symbols into a single SQL Server table. Actually, two SQL Server tables were used – one for Stooq.com and a second one for Yahoo Finance.

The following script generates a top-level comparison of the data sources for the 12 ticker symbols.

  • There are two select statements in the script.
    • The first one is for data from Stooq.com.
    • The second one is for data from Yahoo Finance.
  • The select statement specifies a source value for each set of data as well as providing metrics on the data. Metrics are reported by ticker within a source.
    • The first metric is for the count of trading days.
    • The second metric is for the earliest date for which data is available.
    • The third metric is for the last date through January 21, 2022.
use DataScience
go
 
-- raw count of trading days, earliest trading day, 
-- and most recent trading day by symbol from stooq.com
select 
 'Stooq.com' Source
,symbol
,count(*) [trading day count]
,min(date) [earliest trading day]
,max(date) [most recent trading day]
from dbo.stooq_ohlcv_values_with_symbol
group by symbol
 
 
-- raw count of trading days, earliest trading day, 
-- and most recent trading day by symbol yahoo finance
select 
 'Yahoo Finance' Source
,symbol
,count(*) [trading day count]
,min(date) [earliest trading day]
,max(date) [most recent trading day]
from dbo.yahoo_finance_ohlcv_values_with_symbol
group by symbol

Here is a screen image of the output from the preceding script.

  • Across the 12 ticker symbols, time series data is generally available for 10 years or more. There is substantial variability in the availability of historical time series data.
    • For example, the RIOT ticker symbol is for a crypto mining company, which has data going back to August 29, 2011 from Stooq.com but only back to April 1, 2016 from Yahoo Finance. This ticker in the Yahoo Finance source is the first of two from among the 12 tickers in this tip with less than 10 years of data.
    • The F ticker symbol is for the Ford Motor Company. Time series data from Stooq.com for this ticker is available back as far as January 3, 1977. Yahoo Finance provides historical data for this ticker as far as January 2, 1990.
  • Another way of assessing the differences by ticker between the two vendors and across tickers is to scan the number of trading days for which times series are available. The MARA ticker, which is for another crypto mining company, has just 1886 trading days of historical data in Stooq.com. The same ticker has 2445 trading days of historical data in Yahoo Finance. The MARA ticker in the Stooq source is the second of two tickers with less than 10 years of data.
  • All the tickers have data through January 21, 2022. This is a hint that the availability of time series is more reliably available for more recent periods.

Overall, if you cannot find the data you seek in one source, then you may be able to find it in another source.

query results

Another way of assessing the comparability of the two data vendors is to compute coefficients of determination of the time series values by ticker symbol. This statistical tool (the coefficient of determination) helps to assess how comparable two sets of values are. A value of 1 indicates a perfect correspondence between two datasets, and a value of 0 indicates no correspondence between two datasets.

The following table shows the coefficient of determination as computed by Excel’s linear regression tool.

  • The regressions are for close prices by ticker within a year between Stooq.com and Yahoo Finance.
  • As you can see, nearly all the regressions returned a coefficient of determination value of 1 (to two places after the decimal point), which indicates a perfect correspondence between Stooq.com and Yahoo Finance.
  • Two of the symbols, F and GM, have a coefficient of determination that is very near to 1 (.93). These two are for the least recent year (2019). A prior tip also noted a pattern for the correspondence between Stooq.com and Yahoo Finance to diverge slightly from one for less recent years.
excel data

The TQQQ ticker in the preceding screen image has a coefficient of determination, which is farther away from a value of 1 than any other coefficient of determination; its coefficient of determination is .73. However, this coefficient of determination is for a more recent year (2022) than a less recent year. As a result of this, I performed a more in-depth examination of its root cause. The following Excel spreadsheet excerpt points to the cause of the difference.

  • In the early trading days of 2022 (1/3/2022 through 1/12/2022) the close price in Stooq.com (close_stooq) was about twice as large as the close price in Yahoo Finance (close_yf). The same pattern also held for 2021, 2020, and 2019.
  • In contrast, Stooq.com started reporting close prices as of 1/13/2022 and beyond that exactly matched those from Yahoo Finance. Unfortunately for the coefficient of determination value, Stooq.com did not update its historical values. Hopefully, the root cause of the lower coefficient of determination for TQQQ will be fixed in the future. Otherwise, you can make a revision in the TQQQ data that you copy from Stooq.com.
excel stock data chart
Next Steps

This tip dives down into user interface and T-SQL programmatic issues for migrating time series price and volume data from Stooq.com and Yahoo Finance to a SQL Server table. Also, some computations are performed in Excel based on data copied to a workbook file from the SSMS in SQL Server. As a consequence, the download for this tip includes csv data files, T-SQL script files, and an Excel workbook file.

  • There are 12 csv files generated by Stooq.com and another 12 csv files generated by Yahoo Finance. There is also a txt file named data_notes.txt that describes some minor fixes applied to selected csv files from Stooq.com.
  • There are 2 sql script files for processing csv files.
    • One sql script file is importing values from Stooq.com generated csv files.
    • The other sql script file is for importing values from Yahoo Finance generated csv files.
  • There is also a sql script file that demonstrates basic time series processing techniques with SQL Server as well as for pulling data from SQL Server for further processing by Excel.
  • There is one workbook file.

After you confirm that you can read and process csv files from Stooq.com and Yahoo Finance for the sample data provided with this tip, then you can generate your own sample data files for different ticker sets than those used in this tip. In addition, you can investigate ways of updating the T-SQL code for handling much larger sets of ticker symbols.




Get Started Now - Click here to get your free 14 day trial of SolarWinds Database Insights




get scripts

next tip button



About the author
MSSQLTips author Rick Dobson Rick Dobson is a Microsoft Certified Technical Specialist and well accomplished SQL Server and Access author.

View all my tips


Article Last Updated: 2022-02-24

Comments For This Article

















get free sql tips
agree to terms