March 20, 2011 at 11:41 pm
I have run into a discussion with a work mate of mine who reckons it is a good idea to have pseudo records but I am trying to understand industry standards/best practice or a good reason for having pseudo records.
Business case:
We import daily share prices for companies and if a company is in trading halt we do not import the volume traded as a record is not given by the vendor for that day. However for performance issues my work mate reckons that we should still have a pseudo record inserted with the previous close price. Agreed that this will improve performance however the hassle of data correction will be huge if a file correction is sent after 1 week.
Are people in favour of pseudo records for performance reasons or they prefer to import the exact data supplied by the vendor and for data retrieval purposes apply business rules in a view.
Your ideas much appreciated.
March 21, 2011 at 2:05 am
I would say, it depends...
What do you do now if a file correction comes in after 1 week on records that aren't falsified?
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
March 21, 2011 at 4:35 am
It can depend on your exact situation, but in general i would say that it is not a good idea, as you want the data in your database to be real.
If you need to add records to make a report/dataset complete then you should do this on the way out using views or stored procs.
What is the performance benefit from adding this dummy data?
could the performance issues be addressed in another way by tuning the queries?
have you checked the difference in execution plan between queries with dummy data, and without?
March 21, 2011 at 5:49 am
I'm with Steve. I wouldn't advocate putting artificial data, especially financial data, into the database. It's just too likely to cause problems down the road.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 21, 2011 at 3:59 pm
Thanks everyone for your response.
I was of the opinion that adding pseduo records may result in unnecessary maintenance especially during reruns and data correction.
The psuedo records are constructed on the fly in views. Yes definetely there is a performance impact when comparing reading from a direct table to putting the logic in the views or stored procedures.
This solution of no pseduo records has been there for more than 5 years, it was just a different idea from my colleague and i was just wondering if people have different opinions and approaches to this problem.
Thanks again.
March 23, 2011 at 10:56 am
Aparna,
Yours is a classic example of reporting on something that is usually there, but for some reason isn't there now. I've seen what you call pseudo records (I call them dummy records) used in transactional data and they have always led to problems. I NEVER mess with transactional data - keep it real.
A better approach would be to have a table of recently active companies and use that as your anchor to left join to the transactional data. The company would appear since it was recently active, but there would be no (or zero) transactions for it. If you don't want the company to appear in the transactional data, then change its status to in-active or remove it from the recently active companies table.
This way you always have transactional integrity and don't have to clean up your data later.
This is not unlike a daily sales report by salesman. You use the salesman table to start with as the anchor and left join to sales for the day. If the salesman had a day off, you still see him but there are no sales for him.
Todd Fifield
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply