Keeping user database in memory

  • GilaMonster (1/26/2009)


    serge.kovacs (1/26/2009)


    Maybe I wasn't clear on the inner workings of the application on start-up. It will load all data from all tables, so there should be no need to use indexes. Every table is completely loaded into the memory of the application (simply everything: all columns, all rows).

    Why?

    If that's what you're doing, you may as well be using flat files to store the data; you're not taking any advantage of the power of a relational database.

    If it were only the reading, that would be correct... however, there are also updates, inserts & deletes, for which the PK index is of course used. You must see this as the persistent state of the application. No transaction in the application is complete until the transaction in the DB is complete. If a transaction fails we need to roll back the part of the transaction already sent to the DB. It's all possible with flat files, but a DBMS comes in handy here 😀

  • I have seen similar scenarios, and I know that the problem is APP design.

    Are you sure your app needs ***ALL*** of it before it starts?

    I have been working with SCADA systems and real-time application engineers in the past.

    Lazy initialization and differential state-read techniques solved the problem "dramatically".

    If your APP needs the entire DB in memory you need to FIX it or it won't scale.

    Good Luck


    * Noel

  • noeld (1/27/2009)


    I have seen similar scenarios, and I know that the problem is APP design.

    Are you sure your app needs ***ALL*** of it before it starts?

    I have been working with SCADA systems and real-time application engineers in the past.

    Lazy initialization and differential state-read techniques solved the problem "dramatically".

    If your APP needs the entire DB in memory you need to FIX it or it won't scale.

    Good Luck

    Well, I'm pretty sure my app needs it all in memory... and I'm not worried about scaling either. Of course we could have a nice discussion about that here, but I don't think that is the purpose of this board (or at least not of this thread).

    But just for the fun of it, I'll give you one or two features of this app that might change your mind:

    Many users connect to this app and have a visual representation of the data, every client loading all the data, since most users are viewing, in one way or another, about 80% of the data. Scaling here is done via Dispatchers, to ensure that the initial load of more than 100 users doesn't overload the network or block access to the app server. Dispatchers also load up all data at once, since different users need to connect to them and you don't know what data they'll be needing, but they do need it FAST. Considering that the dispatchers want to load their data from the server, the server has to provide ALL data at once. So no profit from a lazy load.

    The data stored in the tables is used to instantiate objects, but for these objects, or even attributes of these objects, a complex dependency graph is constructed, linking quite a lot of instances with a whole lot of other instances (depending on the model); since we need to load related objects in order to construct this graph fruitfully, we're already in a situation where the initial load would require about 75% of the data... hence, not much use for a lazy load.

    To conclude, the load normally happens only when the app is started; this is not a client application but the server side... so this isn't something you do just for fun. The set of data is limited and well thought out, so scaling isn't an issue. However, because the DBMS does what it should do, rather than what we would prefer at startup, the startup takes too long (or at least according to this particular client).

  • serge.kovacs (1/27/2009)


    noeld (1/27/2009)


    I have seen similar scenarios, and I know that the problem is APP design.

    Are you sure your app needs ***ALL*** of it before it starts?

    I have been working with SCADA systems and real-time application engineers in the past.

    Lazy initialization and differential state-read techniques solved the problem "dramatically".

    If your APP needs the entire DB in memory you need to FIX it or it won't scale.

    Good Luck

    Well, I'm pretty sure my app needs it all in memory... and I'm not worried about scaling either. Of course we could have a nice discussion about that here, but I don't think that is the purpose of this board (or at least not of this thread).

    But just for the fun of it, I'll give you one or two features of this app that might change your mind:

    Many users connect to this app and have a visual representation of the data, every client loading all the data, since most users are viewing, in one way or another, about 80% of the data. Scaling here is done via Dispatchers, to ensure that the initial load of more than 100 users doesn't overload the network or block access to the app server. Dispatchers also load up all data at once, since different users need to connect to them and you don't know what data they'll be needing, but they do need it FAST. Considering that the dispatchers want to load their data from the server, the server has to provide ALL data at once. So no profit from a lazy load.

    The data stored in the tables is used to instantiate objects, but for these objects, or even attributes of these objects, a complex dependency graph is constructed, linking quite a lot of instances with a whole lot of other instances (depending on the model); since we need to load related objects in order to construct this graph fruitfully, we're already in a situation where the initial load would require about 75% of the data... hence, not much use for a lazy load.

    To conclude, the load normally happens only when the app is started; this is not a client application but the server side... so this isn't something you do just for fun. The set of data is limited and well thought out, so scaling isn't an issue. However, because the DBMS does what it should do, rather than what we would prefer at startup, the startup takes too long (or at least according to this particular client).

    So you are saying that your app needs 2 GB of RAM minimum (not counting the OS memory) on the client to run!

    You are also saying that 2 GB must be moved FAST from the server to **all** clients.

    You are also saying you need to bring 300 tables (according to your own post) at once to **all** clients.

    Does that really sound OK to you?


    * Noel

  • Well, I do say that this is OK.

    The clients currently run at around 1.6 GB steady, while peaking at around 2.2 GB.

    The clients usually take less than 40 seconds to load up all the data, and within a minute they are operational (this isn't MS Word or some other lame app). This is thanks to the dispatcher structure, which allows scaling of the number of clients.

    As soon as the data has been loaded by the server, there is no thinking in terms of tables or records anymore. From there on it is far more interesting OO 🙂 So instead of looking at tables, you need to look at the contents and meaning of this data. From this perspective, 3 GB wouldn't be overkill.

    So, yes, the memory usage is high, but that is the price the customer wants to pay for the performance of the clients (don't think of hundreds of desktops with a lot of memory; think of a central server group with a Citrix farm). Funny enough, the customer is much harder to convince of the fact that it isn't possible to 'force' the DBMS to keep a table in memory.

    We also offer another solution where the clients are so-called 'thin clients'... only loading the data that is being displayed on screen (and also receiving updates for this). But as can be expected, even scrolling down a complex list isn't instant. You'll have a split-second delay for each group of lines that comes into sight (something the customer wasn't prepared to put up with).

    So, instead of trying to argue that the application shouldn't function in a certain way, or that it is wrong to do this or that, try to be constructive and try to see that some applications are in need of uncommon features. Or assume that they are, and see that it is actually a missing option in the DBMS not to support a 'pinning' mechanism.

    It's not because something is potentially dangerous that the user should be shielded from it. In certain controlled environments you can make sure that there is enough room in the buffer to load all the data and still have enough space left for all the other actions.

    Does that really sound OK to you?

  • LCAD (1/27/2009)


    Well, I do say that this is OK.

    ......

    So, instead of trying to argue that the application shouldn't function in a certain way, or that it is wrong to do this or that, try to be constructive and try to see that some applications are in need of uncommon features. Or assume that they are, and see that it is actually a missing option in the DBMS not to support a 'pinning' mechanism.

    It's not because something is potentially dangerous that the user should be shielded from it. In certain controlled environments you can make sure that there is enough room in the buffer to load all the data and still have enough space left for all the other actions.

    Does that really sound OK to you?

    Based on my years of experience in enterprise applications, you seldom get the exact feature you want from most frameworks or third-party utilities. The job of an application designer is to leverage the available tools to meet the requirements.

    In this case, it is already known that table pinning is unavailable in SQL 2005. On the other hand, there might be quite a few practical workarounds, some of which have been proposed by previous posters. Since bashing SQL Server won't get you any help, you may as well consider the workarounds if you really want to solve the problem.

    If the loading speed of the external DB is out of your control, you should naturally consider "local" solutions. Can you have a local db that mirrors (not necessarily sql-server-mirroring) the remote db? If yes, I'm sure the initial loading time will be much faster. Can you somehow break the in-memory structure into multiple smaller pieces so that you can save them in the local file system? If yes, you can manage a local file-based cache so that you don't have to load from the remote db on startup.

    I'm just presenting the approaches I'd take to attack this problem.
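
    One hedged sketch of the "local mirror" idea: restore a recent backup of the remote database onto a SQL Server instance sitting next to the app server, so the bulk startup read stays local. All names and paths below are made up for illustration.

    ```sql
    -- Illustrative only: restore last night's full backup of the app database
    -- onto a local instance; the app then performs its bulk startup read
    -- against this local copy instead of pulling everything across the network.
    RESTORE DATABASE AppDb
    FROM DISK = N'D:\Backups\AppDb_full.bak'
    WITH REPLACE,
         MOVE 'AppDb_Data' TO N'D:\Data\AppDb.mdf',
         MOVE 'AppDb_Log'  TO N'D:\Data\AppDb.ldf';
    ```

    How fresh the local copy needs to be, and how updates flow back to the remote database, would of course have to fit the application's transaction model.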

  • Yes, definitely increase the RAM for the SQL server if you can.

    Also, if possible, isolate the physical db files on their own drives, and increase the read part of the cache to 90-95%. Then, even if I/O is required from SQL's side, there won't be an actual I/O delay from the disk side; it will already be in the drive's own cache.

    And try to place the data and logs on separate drives, and be sure to use RAID 1 (or 10) for the logs; the data would be best on RAID 10, but RAID 5 will do.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Maybe locking pages in memory could help you.

    Look at these:

    How to: Enable the Lock Pages in Memory Option (Windows)

    Or here

    Also look at this... for what can go wrong.
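
    As a quick sanity check, SQL Server 2005 notes in its error log whether locked pages are actually in use once the privilege has been granted to the service account and the service restarted. A hedged sketch (the search string matches the message text I've seen on 2005 builds; verify on yours):

    ```sql
    -- Look for "Using locked pages for buffer pool" in the current error log.
    EXEC sp_readerrorlog 0, 1, 'locked pages';
    ```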

    There are crazy HW solutions to this problem, like the use of "RAM drives", but you will be risking integrity with them.

    IF you are willing to take that risk, go for it.

    You could also use "merge" replication so that your Citrix farm reads from the "local" database and all network bandwidth is out of the equation.

    Now my view is that you can't expect SQL Server to have a "warm" cache if you don't access it.

    Good Luck,


    * Noel

    Just one point on the claim that, as the data is in cache, you don't need indexes: I'd like to meet the person who suggested this gives performance. Pah! Even if data is in memory you still need indexes to optimise the queries; otherwise you'll run into issues with page reads. I'd monitor page lookups; this counter is limited by the hardware platform (bus speed/bandwidth etc.), and if it is very high at the times of poor performance then this might be your bottleneck. Technically the buffer cache hit ratio should tell you how well your data is in cache; combine this with monitoring physical disk reads to see if you're fetching from disk.

    There are two factors which limit loading data from disk: bandwidth and IOPS. A single SCSI/SATA/HBA channel will give about 3-4 Gbit/s; depending on the number of drives this might be inadequate, but probably not for a tiny database. So more spindles will help with data loads. You don't need RAID 10 for read performance; RAID 5 is more than adequate for reads (poor for writes, but that's not what we're discussing), so increase the spindles in your array. Try maybe 7 in a RAID 5 or 12 in a RAID 10 (which will give about the same read performance), set any controller cache to 100% write (not read, as that can actually slow reads), and finally give your server more memory: you need at least 4 GB for the O/S and maybe 8 GB for SQL Server; increase the O/S to 8 GB if x64.

    You would do well to check you have lock pages in memory enabled in this situation (not available in x64 Standard Edition).

    Failing that, I'd invest in SSD. You'd need to spend a few thousand dollars/pounds, but you do get quite good performance (from my limited tests with an SSD in my laptop).
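
    The counters mentioned above can also be pulled from inside SQL Server 2005 via sys.dm_os_performance_counters; a sketch (counter names as they appear on a default instance):

    ```sql
    -- Buffer cache hit ratio has to be computed from the raw counter and its base.
    SELECT (a.cntr_value * 1.0 / b.cntr_value) * 100.0 AS buffer_cache_hit_ratio_pct
    FROM sys.dm_os_performance_counters AS a
    JOIN sys.dm_os_performance_counters AS b
      ON  b.object_name  = a.object_name
      AND b.counter_name = 'Buffer cache hit ratio base'
    WHERE a.object_name LIKE '%Buffer Manager%'
      AND a.counter_name = 'Buffer cache hit ratio';

    -- Page lookups/sec is a cumulative counter: sample it twice and take the delta.
    SELECT cntr_value
    FROM sys.dm_os_performance_counters
    WHERE object_name LIKE '%Buffer Manager%'
      AND counter_name = 'Page lookups/sec';
    ```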

    The GrumpyOldDBA
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • colin Leversuch-Roberts (1/28/2009)


    Just one point on the claim that, as the data is in cache, you don't need indexes: I'd like to meet the person who suggested this gives performance. Pah! Even if data is in memory you still need indexes to optimise the queries; otherwise you'll run into issues with page reads. I'd monitor page lookups; this counter is limited by the hardware platform (bus speed/bandwidth etc.), and if it is very high at the times of poor performance then this might be your bottleneck. Technically the buffer cache hit ratio should tell you how well your data is in cache; combine this with monitoring physical disk reads to see if you're fetching from disk. There are two factors which limit loading data from disk: bandwidth and IOPS. A single SCSI/SATA/HBA channel will give about 3-4 Gbit/s; depending on the number of drives this might be inadequate, but probably not for a tiny database. So more spindles will help with data loads. You don't need RAID 10 for read performance; RAID 5 is more than adequate for reads (poor for writes, but that's not what we're discussing), so increase the spindles in your array. Try maybe 7 in a RAID 5 or 12 in a RAID 10 (which will give about the same read performance), set any controller cache to 100% write (not read, as that can actually slow reads), and finally give your server more memory: you need at least 4 GB for the O/S and maybe 8 GB for SQL Server; increase the O/S to 8 GB if x64.

    You would do well to check you have lock pages in memory enabled in this situation (not available in x64 Standard Edition).

    Failing that, I'd invest in SSD. You'd need to spend a few thousand dollars/pounds, but you do get quite good performance (from my limited tests with an SSD in my laptop).

    The funny thing is that LCAD is using a VM with 2 GB of RAM... 🙁


    * Noel

  • My concern is actually data growth. The database may only be 2 GB today; next year it could be 4 GB. It seems like you end up constantly adding more hardware to support the application.

  • Lynn Pettis (1/28/2009)


    My concern is actually data growth. The database may only be 2GB today, next year it could be 4GB. Seems like you end up constantly adding more hardware to support the application.

    Data growth isn't an issue here (luckily). Since we have an extremely controlled environment, the amount of data won't grow more than 10% in 5 years 🙂

    The bad part is that for some reason the customer still has some sysadmins who won't give in to the case for a bit more RAM and an investigation of the HD speed.

    If anyone knows of some easy to read (manager-friendly), short (again manager-friendly) papers that could be of assistance please let me know.

  • Your real problem is not the inability to pin data in memory; it's the need to push all the data in the database to the client as fast as possible.

    For that, you should do the following, within your cost constraints, of course:

    Make sure you have enough memory available to SQL Server to hold the entire database in memory.

    Put the fastest possible processors on the server.

    Avoid using a VM if possible.

    Use 64-bit SQL Server 2005 on 64-bit Windows to allow it to push large amounts of data around at max speed.

    If you use 32-bit Windows, you will need to install Windows 2003 Enterprise Edition to be able to use more than 4 GB of RAM.
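
    On that 32-bit path, the >4 GB support also has to be switched on inside SQL Server via AWE (and the service account needs the "Lock Pages in Memory" privilege); a sketch:

    ```sql
    -- Enable AWE on 32-bit SQL Server 2005 Enterprise; takes effect after a restart.
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'awe enabled', 1;
    RECONFIGURE;
    ```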

    Make sure your network connection is at least gigabit speed.

    Although you cannot absolutely force data into memory, you can do things that ensure the vast majority is there, like running queries against all tables that require table scans. Probably running a full backup just before your morning application restart would be the simplest method to get the entire database into memory. You can monitor the situation with Performance Monitor to see how many database page reads you get during the load; if it stays at or near zero, you are good.
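
    A hedged sketch of the warm-up idea: generate one scan per user table so its pages land in the buffer pool. Nothing here is specific to the application; the table list comes straight from the SQL Server 2005 catalog views.

    ```sql
    -- Build and run one COUNT_BIG(*) per user table to pull its pages into cache.
    DECLARE @sql NVARCHAR(MAX);
    SET @sql = N'';
    SELECT @sql = @sql
        + N'SELECT COUNT_BIG(*) FROM '
        + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name)
        + N' WITH (INDEX(0)); '      -- INDEX(0) forces a scan of the base table
    FROM sys.tables AS t
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id;
    EXEC sys.sp_executesql @sql;
    ```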

    I also have concerns about this application design. Although you have a controlled set of requirements, things have a way of changing, so good luck:

    Boss: The company that just bought us loves your application and wants to start using it. The new data will total around 300 GB. How many days will it take you to get this going?

    You: But, you promised the application would never have to deal with that volume of data!

    Boss: Guess I was wrong. By the way, I told the new CIO you would have this live by the end of next week.

  • Boss: The company that just bought us loves your application and wants to start using it. The new data will total around 300 GB. How many days will it take you to get this going?

    You: But, you promised the application would never have to deal with that volume of data!

    Boss: Guess I was wrong. By the way, I told the new CIO you would have this live by the end of next week.

    :D:D:D LOL :D:D:D

    And let's add:

    Boss: By the way, if you can't do it, you may get ME in trouble. No pressure 😀


    * Noel

Viewing 15 posts - 16 through 30 (of 41 total)
