Getting data into SQL Server

  • Hi,

    Working for a small organisation and we now have a need to stop using spreadsheets to store data and move to a database. I'm a Data/BI Analyst, but my SQL skills really stop at querying to get data out. A lot of the more manual research work going on relies on data entry. 

    3 questions;

    1) How easy will it be for me to set up a single database, with a few tables? I am aware of the need for Primary Keys etc, but not sure how to implement this. More than willing to learn if there are any decent resources. At present this really only needs to be on my machine, I have got the local instance set up but with no data (aside from WideWorldImporters).
    2) How can I easily import our existing Excel data into SQL server, I have tried using the import wizard but get an error message ' The 'Microsoft.ACE.OLEDB.12.0' provider is not registered on the local machine. (System.Data)', I have tried using both the 32 and 64 bit wizard, neither work. Is there any software better suited to this?
    3) For capturing future data which would otherwise be entered into a spreadsheet are there user form type tools which can be created for researchers to directly enter data into the database?

    Some of you might point out I am missing some steps in here, so any advice will be greatly received.

  • Setting up the tables isn't necessarily difficult, but getting the design of them correct is the MOST important part.   If you can share some sample data and describe in detail what the business needs are with regard to data entry, business process, and reporting, we can probably provide more detailed answers on how to set up those tables.   As to importing Excel data, that requires that your data conform to some rules, such as consistent data types within columns, as well as a single row of column headers that doesn't duplicate column names.

    With regard to future data, you might want to take a look at creating an MS Access front end for your SQL database.  I've done that quite frequently for smaller applications like this and it works pretty well.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • 1) Creating tables is simple. Getting them to fulfil your requirements, ensure that data integrity between each is maintained, they're efficient, etc, etc, is the challenge. your question is quite open ended, but if you could be a bit more consice on your needs, I'm sure others have plenty of links they could provide you.
    2) Sounds like you don't have the Microsoft Access Database Engine 2010 Redistributable Drivers installed.
    3) Personally I dislike Spreadsheets inserting Databases. I would rather use Access with forms connecting the the SQL Server, or custom software. If you want to get away from Spreadsheets, don't give your researchers the option of using them to continue to insert data.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • A few example tables attached. (this is just a couple of rows of dummy data for each example table). There would be a few more columns for each table. There would be an additional table populated by the userform which would have person level information.

    The manual entry part is usually a bespoke element for a project, although I would see no problem in using a userform where the bulk of the additional data is along the same lines and if not relevant for a particular project then that question on the form is left blank.

    If it helps for context we are a HR/Recruitment Consultancy and the projects are heavily insight based. We use PowerBi for presenting data so I can easily connect SS to PowerBi.

  • james.h291183 - Wednesday, April 5, 2017 8:11 AM

    A few example tables attached. (this is just a couple of rows of dummy data for each example table). There would be a few more columns for each table. There would be an additional table populated by the userform which would have person level information.

    The manual entry part is usually a bespoke element for a project, although I would see no problem in using a userform where the bulk of the additional data is along the same lines and if not relevant for a particular project then that question on the form is left blank.

    If it helps for context we are a HR/Recruitment Consultancy and the projects are heavily insight based. We use PowerBi for presenting data so I can easily connect SS to PowerBi.

    The relationship between the data in each of the tabs is not immediately obvious in many cases.   You'll need to be fairly explicit about what this data is supposed to represent, and what / how you use the data from each tab.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Can we make a couple of "safe" assumptions? As there are no key fields in your spreadsheet, and no data validations, can we rely that every Company's name on the Company sheet will be unique? If not, will Company & Location be unique? Also can we assume that on the People/Live Roles sheets, the Company/Employer field will match exactly what the name of the company is on the company sheet? If Company & Location is your unique identifier, how do you know if John Smith works for ACME London or ACME Manchester?

    Also, can I assume that the column "No. of Employees" having values of "Head of IT" and "head of Finance" on the Company table is an error?

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Yes it is an error, that was me trying to copy and paste to speed things up. 

    I can add data validation to ensure that things are entered correctly. I would want to make sure that unique names are kept unique.

  • If I was joining tables, I would typically only be looking for relationships between people or companies.
    For the career history tab I would look at things like time in role, max time, min time. This might be at a person or company level, there would be a sector field in there as well and would be grouping by sector etc.

    At a company level, would be maybe looking for connections between size of company by no. of employees or revenue and then in the news table by no.of job gains or losses for example. We are a consultancy so every project differs, but the same types of analysis are often carried out.

  • I completely agree that you should not use spreadsheets to store data. As suggested you can use Access as a front end to SQL server
    but you need to learn how to do it. But you can stil use a spreadsheet as front end to a database. Excel has a lot of functionality and VBA
    language which is very useful. You don't need many rows of code to import tables of data in Excel to tables in SQL server and you can design
    forms with drop down data evaluation. You can lock the sheets so the user cannot destroy the design. I have designed several budgeting systems with
    this approached. As you mention Power BI is very useful to analys in this case the budget data and the actual data from the general ledger also
    are stored in the database. The "drawback" you need to learn. May be I can be of assistance?
    Gosta M

  • Gosta Munktell - Thursday, April 6, 2017 2:19 AM

    I completely agree that you should not use spreadsheets to store data. As suggested you can use Access as a front end to SQL server
    but you need to learn how to do it. But you can stil use a spreadsheet as front end to a database. Excel has a lot of functionality and VBA
    language which is very useful. You don't need many rows of code to import tables of data in Excel to tables in SQL server and you can design
    forms with drop down data evaluation. You can lock the sheets so the user cannot destroy the design. I have designed several budgeting systems with
    this approached. As you mention Power BI is very useful to analys in this case the budget data and the actual data from the general ledger also
    are stored in the database. The "drawback" you need to learn. May be I can be of assistance?
    Gosta M

    The problem with "passwording" an excel document is it doesn't really stop anyone from editting it. A really simple way is to copy everything on the sheet(s), and paste it into a new workbook. Then you have a completely no holds bar version of the Excel document. Then, if you want to be really "stupid", you save over the original document, and now EVERYONE has a copy of the unsecured/unprotected file.

    Although I'm playing devils advocate, I'm talking about this from experience.  Both in my current job, and previous, I've had to build somewhat complex spreadsheets for people, which has VBA macros, data validations, etc, etc, and all the protections are set up. then, I get a phone call "Hi Thom, the spreadsheet isn't working anymore. can you fix it?" I open it up, and first thing I notice, all the hidden sheets are shown, well, all of them apart from one of the data validation ones, which breaks the sheet and they can't enter data in it. Then, in another column, someone removed the data validation, and there's values like "Insuer" instead of Insurer, breaking other parts of the workbook.

    Excel is not a secure environment for building an "application", it certainly doesn't handle multiple user environments well. For sharing of tabular data, fine, it does that well, but Excel is not a Front End User Interface for managing your data or business. There are far more robust and better options. Like I said above, Access seems like a really good option for you here. you can build a simple Front end Form application for your users, with lowered permissions (which is a breeze in Access, and you can't get around it anywhere near as easily), and they can go "wild". If you ensure you have good data validations, and "idiot proof" it where you can, you'll be able to rely on the system much more than any Excel document.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Thanks, fairly certain I could learn the Access element to this. Any decent resources available?

    One additional issue if I am giving users a front end to populate a DB that is on my machine, is this doable or do we need to have the database somewhere else?

  • Thom
    I agree. Excel is not a full proof and recommended front end if we are talking about a multiuser system for daily work.
    In my case the users are educated cost center responsible staff people and the communication to the database is
    bidirectional with a lot of calculations and sometimes simulation. The functionality in Excel is then very useful.
    If the purpose is to just to enter data in forms I would use Access as a front end but also look at Webb based solution.
    Gosta M

  • Ok, firstly, I'm going to admit, I've done this more for other readers feedback rather than the OP's (sorry). Table design isn't something I really get to do at work, as the application we use is 3rd party and based on mainframe data (with barely any foreign keys, and certainly no check constraints). This is, however, the kind of set up I would imagine. I make a couple of comments below, in regards to missing tables, for example JobTitle should be separated, and possibly Locoation. The indexes I've done are simple, and this is where firstly my knowledge is low (as I don't get to really make them here), and secondly requires knowledge of the data. I've guessed, based on the Company seems to be the "winning" field.

    Note that I also have an identity column, as names are not unique, and never will be. As i said above, you'll need to consider this is you have companies with the same name (hopefully different locations), and how you're going to match employees to the correct one. If Joe works for Tesco, is he working to Tesco Thurrock, or Tesco Rainham in your document (maybe it's not a problem right now, but keep it into consideration).

    Anyway, I await much grilling from some people 🙂 If Mr Celko comes along I really will appreciate any the constructive feedback you give 🙂
    USE DevtestDB;
    GO

    CREATE TABLE SSC.Company
      (CompanyID int IDENTITY(1,1)
       CONSTRAINT Company_PK PRIMARY KEY NONCLUSTERED,
      CompanyName varchar(200) NOT NULL,
      CompanyLocation varchar(100),
      EmployeeNum int NOT NULL, --This might not actually be needed if it can be dervived from the people table
      Turnover decimal(16,2) NOT NULL,
      Profit decimal(16,2) NOT NULL,
      ProductLines int,
      Website varchar(100));

    CREATE CLUSTERED INDEX CompanyName_IX ON SSC.Company (CompanyName ASC)
    GO

    CREATE TABLE SSC.Person
      (PersonID int IDENTITY(1,1)
       CONSTRAINT People_PK PRIMARY KEY NONCLUSTERED,
      PersonName varchar(100) NOT NULL, --Ideally you should want to split these out
      PersonLocation varchar(100) NOT NULL,
      JobTitle varchar(50) NOT NULL, --This should probably be a foreign key, but using your current design I have not done this
      CompanyID int NOT NULL
       CONSTRAINT Person_Company_FK FOREIGN KEY REFERENCES SSC.Company (CompanyID),
      StartDate date NOT NULL,
      PreviousRole varchar(50) NULL, --Assumed that an employee may not have a previous role. Do you only ever store the last one? If more than one, this should go else where
      PreviousCompanyID int NULL
       CONSTRAINT Person_PreviousCompany_FK FOREIGN KEY REFERENCES SSC.Company (CompanyID),
      PreviousStartDate date NULL,
      PreviousEndDate date NULL,
      Email varchar(100) NOT NULL,
      Telephone varchar(15) NOT NULL,
      Age smallint NOT NULL,
      Salary decimal(9,2));

    ALTER TABLE SSC.Person CHECK CONSTRAINT Person_Company_FK;
    ALTER TABLE SSC.Person CHECK CONSTRAINT Person_PreviousCompany_FK;
    CREATE CLUSTERED INDEX CompanyID_IX ON SSC.Person (CompanyID ASC); -- I have guessed this one, as I do not really know what you are most likely to be searching on

    GO

    CREATE TABLE SSC.News
      (NewsID int IDENTITY(1,1)
       CONSTRAINT News_PK PRIMARY KEY NONCLUSTERED,
      CompanyID int NOT NULL
       CONSTRAINT News_Company_FK FOREIGN KEY REFERENCES SSC.Company (CompanyID),
      Headline varchar(200) NOT NULL,
      Summary varchar(max) NOT NULL, --I assume tehre could be a lot in here
      JobGains int NULL,
      JobLosses int NULL,
      NewsDate date NOT NULL);

    CREATE CLUSTERED INDEX CompanyID_IX ON SSC.News (CompanyID ASC); --I have guessed this one, as I do not really know what you are most likely to be searching on

    GO

    CREATE TABLE SSC.LiveRole
      (RoleID int IDENTITY(1,1)
       CONSTRAINT Role_PK PRIMARY KEY NONCLUSTERED,
      CompanyID int NOT NULL
       CONSTRAINT LiveRole_Company_FK FOREIGN KEY REFERENCES SSC.Company (CompanyID),
      JobTitle varchar(50) NOT NULL, --Now that I have got here, you really do want a JobTitle table.
      Salary decimal(9,2) NOT NULL,
      --Location, this should be picked up from the Company ID, you do not need it here.
      DatePosted date);

    CREATE CLUSTERED INDEX CompanyID_IX ON SSC.LiveRole (CompanyID ASC); --I have guessed this one, as I do not really know what you are most likely to be searching on

    GO
    --Couple of sample inserts, however, you will need to create an ETL process to get your data in.
    INSERT INTO SSC.Company (CompanyName, CompanyLocation, EmployeeNum, Turnover, Profit, ProductLines, Website)
    VALUES ('A1','London',20,234519432.00, 1426232.00, 14, 'www.A1Products.com');
    GO

    INSERT INTO SSC.Person (PersonName, PersonLocation, JobTitle, CompanyID, StartDate, PreviousRole, PreviousCompanyID, PreviousStartDate, PreviousEndDate, Email, Telephone, Age, Salary)
    VALUES ('Joe Bloggs', 'London', 'DBA', 1, '20150101', NULL, NULL, NULL, NULL, 'J.Bloggs@A1Products.com', '02071234567', 35, 49500);
    GO
    --Very Simple Select Statement

    SELECT P.PersonName, P.Salary, P.Age
    FROM SSC.Person P
      JOIN SSC.Company C ON P.CompanyID = C.CompanyID
    WHERE C.CompanyName = 'A1';

    GO
    --Clean up

    DROP TABLE SSC.LiveRole
    DROP TABLE SSC.News;
    DROP TABLE SSC.Person;
    DROP TABLE SSC.Company;

    GO

    Sorry for the minimal tabbing. When I paste into SSC, double/triple spaces are compressed to a single space. It really peeves me off, but I am not fixing it all in the SSC post editor...

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • James
    This might be a solution:
    If the users have a Access client with the front end, the database SQL server can the hosted in a server within your Intranet
    (or in the cloud)
    It is not recommended that this server is your own PC. Be aware of back up.
    Gosta M

  • Gosta Munktell - Thursday, April 6, 2017 4:13 AM

    Thom
    I agree. Excel is not a full proof and recommended front end if we are talking about a multiuser system for daily work.
    In my case the users are educated cost center responsible staff people and the communication to the database is
    bidirectional with a lot of calculations and sometimes simulation. The functionality in Excel is then very useful.
    If the purpose is to just to enter data in forms I would use Access as a front end but also look at Webb based solution.
    Gosta M

    Excel is fine for data extraction. I would have them to insert/manage data in Access 9or similar), but Excel is fine for manipulating it and creating some ad hoc reports or analysis. Excel can use external data source connections, so you could easily set up ways for them to connect via that. It would just to one way (output only, no insert, which, in my opinion, is ideal).

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply