March 16, 2015 at 5:00 am
Hello all,
First time poster here! I would like peoples opinions here. I am collecting daily stock data for the past 10yrs (so approximately 2500 rows of data which is not significant), however I have over 200 stocks (will grow to 1000 over time possibly) with about 30 individual fields per stock. So we are talking about a table that has originally 6000 columns that grows to have over 30000 columns one day, but more appropriately perhaps only 6000-10000 columns to start with. I recognise that an SQL database may not be the best approach, there are many 'big data' programs like mongodb, hadoop etc however I never used them (only SQL)
Can anyone shed some light on the best way to store this in SQL. I know each table is limited to 1024 columns, do people suggest I break this into possibly 10+ separate tables?
Looking forward to opinions
March 16, 2015 at 5:12 am
Does each stock type have the same 30 attributes (columns)? Please will you give some examples?
John
March 16, 2015 at 5:15 am
SQL Server is as good as any to store this data but it depends on how you want to use it.
Don't store all columns for all stocks in one row. Use a single row to store a single stock (30 columns) per date. You can then add addtional stocks without having to alter the table structure.
ID (Identity)
Date
StockCode
Col1
Col2
Col3
March 16, 2015 at 5:16 am
Yes each stock will have the same features. So for example:
-high
-low
-close
-open
-180 day volatility
-volume
-market cap
-RSI
etc etc
The features themselves are all numeric in nature, though sometimes there may be null values as the data didn't exist. The first column will be the date
March 16, 2015 at 5:21 am
roger.price-1150775 (3/16/2015)
SQL Server is as good as any to store this data but it depends on how you want to use it.Don't store all columns for all stocks in one row. Use a single row to store a single stock (30 columns) per date. You can then add addtional stocks without having to alter the table structure.
ID (Identity)
Date
StockCode
Col1
Col2
Col3
Roger could you please elaborate further. I don't quite understand how you would add additional stocks to the this table structure. I've always predefined my columns and their type.
March 16, 2015 at 5:25 am
OK, so you don't need 6000 columns. Your table will look a bit like this:
CREATE TABLE StockData (
CheckDate date NOT NULL,
StockCode int NOT NULL,
StockHigh decimal(9,2) NOT NULL
StockLow decimal(9,2) NOT NULL
StockClose decimal(9,2) NOT NULL
StockOpen decimal(9,2) NOT NULL
Volatility int NULL
Volume int NOT NULL
MarketCap decimal(9,2) NULL
RSI decimal(9,2) NOT NULL
)
ALTER TABLE StockData ADD CONSTRAINT PK_StockData_CheckDate_StockCode
PRIMARY KEY (CheckDate, StockCode)
John
March 16, 2015 at 5:31 am
Excellent! You guys solved my problem. I made the problem much harder for myself and you guys simplified it. Thankyou
March 16, 2015 at 6:25 am
I'd strongly suggest you look into database design and normalization processes. Using referential tables you shouldn't be looking at tables with 100 columns, let alone 10000. Louis Davidson is one of the best people I know at teaching design. Here's his book.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply