June 4, 2018 at 10:56 am
SQL is not my primary language and I could use some insights into how to accomplish a task I need to perform on multiple separate database systems.
We use a third party accounting program that interfaces with SharePoint, allowing our clients to submit invoices through their website and submitting to their system for processing, which is done outside of SharePoint. For various reasons we end up with items in SharePoint that do not have a correlating record in our accounting back end DB or we have duplicates in SharePoint. I'd like to find the Duplicates and the orphans in a process that would be run perhaps quarterly.SharePoint Databases are set up 1 per client and we have about 200 Clients. These clients also have corresponding data in the remote database, which a multiple clients share, distinguished by a client ID. I say 'remote' but these DBs are on our domain and controlled by us but they are just not linked nor will they ever be due to security requirements.
I've been using PoweShell to do the work so far. I need to go through each client SharePoint site and get all invoice IDs from a list (this part is simple) and then perform a lookup on each one in the remote database to insure there is a record there and if not then make a note of it and if it is in fact orphaned in SharePoint, remove the SharePoint record or if a duplicate make a note and remove the duplicate, which is also a manual process.
I have a PowerShell script that I am using which works well for sites that have 20 or so invoices. It runs on a SharePoint app server which also has SQL installed on it (for SSRS) and therefore has the SQL PowerShell objects needed for the queries. I get all the invoice IDs from SharePoint then run an IN statement on the remote database and get a result set back which I persist to a text file. I review them, manually at the moment since the process is still early in development, to look for IDs that are missing on the remote DB or dupes in SP. Problem is some clients have several thousand invoices that will not do well using an IN statement and since these are production databases I don't really want to be doing an unnecessarily big query like that for fear of locking rows or worse.
So, at the root of this seems to be the problem of looking up data on a remote, non-linked SQL Server database. I don't really want to do a query for each invoice since that could mean several thousand hits per client, though that may be the best method in the end. I also want to avoid creating a physical table in the remote databases. i would not get permission to do so. I could always just export the data in the remote DB and put them in a database but with 200 clients and growing that doesn't seem the best option and could take a lot of my time.
Curious to hear ideas from experts on how this might be accomplished.
Thank you!
Even as a mother protects with her life
Her child, her only child,
So with a boundless heart
Should one cherish all living beings;
June 4, 2018 at 11:26 am
One option might be to populate a global temp table (name starts with ## instead of just #) with the ID values you are getting, and then LEFT OUTER JOIN to that table to see which rows of your source data don't have a matching ID value in that temp table. It's still a long query with 1,000's of INSERTs, but at least the detection mechanism is then fairly easy. Since the temp table only needs to be one column wide, it shouldn't be all that large....
Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)
June 4, 2018 at 11:29 am
pseudo code
read sharepoint list - load onto datatable Invoices
using dbconnection to sqldb -- using transactionscope
{
create temp table (#invoices) -- execute nonquery
load datatable invoices onto #invoices -- using sqlbulkcopy -- this makes it very fast
execute required queries - join with #invoices -- execute query (Note 1)
}
Note1:
This step is where you would do required joint to identify dups/missing/new and act upon them as needed - it could mean multiple queries and multiple actions on each. Or a big query with multiple steps on it.
As at the moment you are doing all review manually the query above would just be a standard select join with #invoices |export-csv style
--
Note that I would not use the standard SQL powershell objects but rather the standard data access modules which do not require SQL to be installed, just the MDAC
June 4, 2018 at 12:14 pm
frederico_fonseca - Monday, June 4, 2018 11:29 AMpseudo coderead sharepoint list - load onto datatable Invoices
using dbconnection to sqldb -- using transactionscope
{
create temp table (#invoices) -- execute nonquery
load datatable invoices onto #invoices -- using sqlbulkcopy -- this makes it very fast
execute required queries - join with #invoices -- execute query (Note 1)
}Note1:
This step is where you would do required joint to identify dups/missing/new and act upon them as needed - it could mean multiple queries and multiple actions on each. Or a big query with multiple steps on it.As at the moment you are doing all review manually the query above would just be a standard select join with #invoices |export-csv style
--Note that I would not use the standard SQL powershell objects but rather the standard data access modules which do not require SQL to be installed, just the MDAC
Thanks for this. I'm not familiar with the MDAC so I'll give that a look. I will give this method a try. I appreciate your effort on this!
Even as a mother protects with her life
Her child, her only child,
So with a boundless heart
Should one cherish all living beings;
June 5, 2018 at 11:55 pm
This was removed by the editor as SPAM
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply