We had a solar system installed at our house this year. I’m excited to see how this performs, as our estimates and research shows this ought to be a good financial decision for us over time. While the hardware that came with the system includes some monitoring and reporting, I wanted to track things independently to be sure that I have the data. I know many of these companies might not be as prepared for an issue as I would like, and if they lose some of my historical data, I’m not sure they care.
I decided to set up a small database, which will need an import process along with reporting and this is the first in a series of posts on how I’m addressing the database design. In this post, I’ll look at the initial tables I created.
Estimated Production Table
My system included some estimated levels of production for the year, and I will be able to record the actual levels each day. I decided to track these two sets of data separately for a couple reasons.
First, the estimates are monthly, and they do not vary. While I could just stick this data in the same table, it’s a lot of wasted data. Not a lot of space, but still, I decided to be efficient here. The estimates I have are a total for each month, with the math done to give me a daily power level. I decided to create this table:
CREATE TABLE [dbo].[SolarPowerEstimate](
[TrackingKey] [int] IDENTITY(1,1) NOT NULL,
[trackingmonth] [tinyint] NULL,
[estimate_month] [numeric](6, 2) NULL,
[estimate_daily] [numeric](4, 2) NULL,
CONSTRAINT [SolarPowerPK] PRIMARY KEY CLUSTERED
(
[TrackingKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
This table has a PK just to keep things simple, and then I have a month number, which tracks for which month I have an estimate. There should only be 12 months in this table, as the estimate is supposed to repeat each year. I included the numeric values for the month and daily levels.
The data in this table looks like this:
I can join this with my actual production to compare how well things are working.
Actual Production
For Actual production, there is a value for each day. As a result, I need a date and a numeric value. I decided to separate out the date into separate parts, as I can always combine those, but this is really a data warehouse structure for me and I want to quickly join this with my estimate. I also expect to do some reporting by month, so having the month separated out (and the year) is a quick way to join data without needing a function.
CREATE TABLE [dbo].[solarpoweractual](
[TrackingKey] [int] IDENTITY(1,1) NOT NULL,
[trackingyear] [smallint] NULL,
[trackingmonth] [tinyint] NULL,
[trackingday] [tinyint] NULL,
[actual_daily] [numeric](10, 4) NULL,
CONSTRAINT [SolarPowerActualPK] PRIMARY KEY CLUSTERED
(
[TrackingKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
This table will be populated with numbers for the date parts and then the production value. Right now, I see data like this:
I’ll go over reporting and how I use this data in another post, but there is one more table I need for this system.
Staging Imports
I can download data daily, but I really don’t care about the flows of the data each day. The data is reported each 15 minutes, but that’s a bit granular for me. Instead, I want to download monthly data. If I do that, I get a row for each day of the month, but some days are 0 if they are in the future. The current day is also incomplete until the sun goes down, so I may need to update that data regularly.
Rather then try to parse the data and build a complex ETL process, I’m aiming for an ELT, with a T that moves data from a staging table to my actual table with an upsert process.
The csv I get from my monitoring system is a date and a numeric value, so I built a staging table like this:
CREATE TABLE [dbo].[SolarStaging](
[Time] [date] NOT NULL,
[System_Production_Wh] [varchar](50) NOT NULL
) ON [PRIMARY]
GO
My aim here is to truncate this table, load the entire CSV, and then transform data as needed.
Summary
That’s the basics of my solar tracking database. I have a place to land new data, a table for the estimates I have for each month of the year, and then a table that is essentially a fact table of actual values.
I’ll add more details on how I load data, as well as how to analyze the data over time.