Answers To My SQL Questions
I provided some SQL questions for an interviewer in
Who Do You Hire? to ask of a candidate. I had a couple requests for answers, so
I decided to provide some. I would recommend you check the questions first before you read
these.
- This is a cross product or a full outer join. The number of rows is 5 * 3 = 15
- A correlated subquery is a query within a query where the inner query is evaluated for each row in the outer query.
select o.orderid from orders o where o.custid = ( select o1.custid from orders o1 where o1.qty > 1 and o1.orderid = o.orderid )
select distinct c.name, o.orderid from customers c, orders o where c.custid = o.custid
select c.name, o.orderid, count(o.orderid) from customers c, orders o where c.custid = o.custid group by c.name, o.orderid having count( o.orderid) > 1
- RI is explicitly declaring the relationships between tables (foreign keys) using DDL and the built-in server functions rather than using triggers or stored procedures to enforce relationships.
Normalization is a technique for storing data in a relational database system.
There are various levels or normalization, but when most people speak of
Normalization, they are referring to third normal form. Under this level,
there are no repeating data elements, fields depend on the complete primary key,
and no dependencies on non-key columns. Also no derived columns.
Usually third normal form reduces the amount of data storage needed as well as
makes OLTP applications quicker by reducing the amount of data necessary
for an insert.
Denormalization is often done to reduce the number of joins required for a
query. Usually in OLAP applications.
- I would perform nightly full backups, differential backups every four hours between fulls, and transaction log backups every hour in between. If additional protection against loss of data was needed, this schedule could be changed to reduce the time between transaction log backups.
I would perform all backups to a local disk using native SQL dump. Then these would be copied to a remote server and backed up to tape from that server.
- There are two ways to do this. If the event is tracked by SQL Server, an alert can be set to notify someone with email / pager / broadcast. If this is a business alert, then code would have to be written using a stored procedure to manually send the alert.
- Have a stored procedure that checked for large sales, by whatever definition is being used for large. This procedure would send an email alert to the sales manager. I would schedule a task using SQL Agent / SQL Executive to run this task as often as needed.
- Online Analytical Processing. Usually used in to describe the client access to a "cube" of precalculated data that is designed for reporting purposes. The data is often stored in a star schema that allows for rapid access and manipulation of data based on dimensions and calculated measures the cube designer has incorporated into the cube.
- Given a basic requirement without knowing additional details, here is what I would propose:
Setup the following partitions:
- RAID 1 partition for the OS and SQL OS
- RAID 1 partition for the pagefile
- RAID 1 partition for each log file
- RAID 5 partition for each data file group -
heavily used tables could be placed on their own RAID 5 partion and filegroup
I would 2 dual processor machines in a cluster configuration with 2GB of RAM each.
- Clustered indexes are indexes where the data rows are actually the leaf nodes of the index. The data is then physically stored in the indexed order.
Only one clustered indexes per table
255 additional nonclustered indexes per table.
- In query analyzer, view the execution plan and look for scans rather than seeks. Perhaps additional indexes would help. Update the statistics if there are indexes
on these tables.
I would also ensure that the tables are not fragmented by checking dbcc showcontig and perhaps reorganizing the table.
- Replication is the process by which changes to a tables in a database as automatically moved to another database by the SQL Server processes.
There are single-single, single-many, many-single, and multi-merge replication
If you have data on one server that you need reflected on another server, replication can ensure that the data gets moved.
Steve Jones
December 2000