More fun with Recursive CTEs

  • Instead of dealing with courses this time, I'm trying to sort out how to model a recipe problem. (It's bread, so it's not crazy complex.)  Oh wait, here... have some table definitions:

    USE [Bakery]
    GO
    /* Sorry, this started in Abscess */
    CREATE TABLE [dbo].[Ingredient](
    [IngredientID] [int] IDENTITY(1,1) NOT NULL,
    [IngredientName] [varchar](255) NOT NULL,
    [UnitPrice] [money] NULL,
    [IngredientWeight] [real] NULL,
    [Volume] [real] NULL,
    [IsFlour] [bit] NULL,
    [VendorID] [int] NULL
    ) ON [PRIMARY]
    GO
    USE [Bakery]
    GO

    CREATE TABLE [dbo].[Recipe](
    [RecipeID] [int] IDENTITY(1,1) NOT NULL,
    [Recipe Name] [nvarchar](255) NULL,
    [Yield] [int] NULL,
    [Yield_Kg] [int] NULL,
    [SalePrice] [money] NULL
    CONSTRAINT pkRecipe PRIMARY KEY (RecipeID)
    ) ON [PRIMARY]
    GO

    CREATE TABLE [dbo].[RecipeIngredient](
    [RecipeID] [int] NOT NULL,
    [IngredientID] [int] NOT NULL,
    [RecipeIngredientWeight] [real]
    CONSTRAINT fkIngredientID (IngredientID) REFERENCES Ingredient(IngredientID),
    CONSTRAINT fkRecipeID (RecipeID) REFERENCES Recipe(RecipeID)
    ) ON [PRIMARY]
    GO

    For the most part, recipes are really simple (RecipeID, IngredientID, Weight).

    The one exception is Starter (instead of using commercial yeast, he basically grows his own. Right now, I'm only aware of two Starters, Poolish (50% water, 50% flour by weight), and another is Sourdough Starter

    So given a recipe that requires (for example {2kg flour, 1 kg sourdough starter} I want to expand that to {Water, Potato Flakes, Sugar, NW Flour (just multiply the total weight (1 kg) by each percentage).

    What's the cleanest/most flexible way of doing this? Sounds a bit like a recursive query... "Starters" are kinda like "assemblies" in the bike example.. I was thinking of adding some Starters to the Recipes table, and then just refer to that (and specify a recipe weight)

    (Yeah, in theory, I should record all the recipe amounts as share vs total flour weight, but let's not get too crazy).

    There is no recipe that does not include some kind of starter (either Poolish or Sourdough starter).

    Okay, end goal: I want to be able to price all this stuff easily... (in a data warehouse eventually).

    Does it make sense to create a recursive CTE to get the final ingredient cost? (because the CTE will multiply "ingredient weight" by the percentages of ingredient for Starter and then just union that with all the non-starter ingredients and then I can multiply ingredient weight * ingredient cost, and sum all that to get the cost of a batch (and then divide by yield to determine unit cost). My big question is whether using a CTE for this is overkill or the right way to do it?  If I use the recursive CTE, I'd want to return just the "final" items. (as Ed (I think) pointed out... those are the ingredients with no "subingredients"/no child records... (gotcha).

    The recipes are static... it's the ingredient prices that fluctuate, and I'm just trying to determine item unit cost over time. (Yeah, there's other stuff to add in, like overhead, but that's easy).

    Maybe be back to edit this tomorrow in case this makes no sense. I'm just trying to figure my way through the last part I'm not sure about. (Then I'm going to create a bunch of ingredient purchase data, add in some sales, and model the effect of ingredient prices on gross and net profit.)

    No need to answer a lot here. Just point out the parts that are in the right/wrong direction. (So if you add more than 50 words, you've probably worked too hard on the answer!)

    Thanks!

    Pieter

  • One line makes me shudder:

    There is no recipe that does not include some kind of starter (either Poolish or Sourdough starter).

    So does Poolish (a recipe) ALSO have a starter? If so, you will have infinite loops if you go with recursion.

    But lets assume that Poolish and Sourdough are the "base" and will never have a starter. How many loops can you have? Is it ALWAYS going to be 1 recipe + 1 starter OR is there going to be a case where recipe A uses B which uses C which uses Starter? If it is JUST 1, then this can probably be done without any recursive CTE's. IF the number of loops is unknown, then you'll need some recursion. Well, not "need" but I would recommend it.

    Just my 2 cents.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • So does Poolish (a recipe) ALSO have a starter?

    No. Poolish / Starter is what the yeast grows in.  Not that it matters that much, but Poolish is 50% flour, 50% water (might be by volume), The other Starter is {Water 40%, Potato Flakes 8%, Sugar 32%, White flour 21% by weight}. If they do, it's minimal. You do add a little of the old starter to get the new one going, so it's not 100% exact, but I think that's just a limitation of the model.

    So if I take the amount of Starter/Poolish, and then multiply by the percentages, I get a list of weights. I think I could use a table valued function for this. Just retrieve the most recent prices for each ingredient, multiply by percentage and then by the goal weight. The reason I thought it would be recursive is that they're both "intermediate" ingredients (that consist of other ingredients).

    My thinking was something like:

    Take the entire recipe and calculate the amount of ingredient for everything except the Starters.

    Then use the percentages for the given Starter to return a mass of starter of the appropriate weight. (Just multiply... nice and easy). Then I basically union the scaled Starter recipe and the loaf recipe, and sum all that and I end up with a total weight of everything. Then just join that to the latest ingredient price list (relative to manufacture date), and I'll have the unit cost (well, the ingredient part).

  • A data model to store the recursive relationships and to resolve them with a common table expression could include tables: ingredients, recipes, recipe_inputs, recipe_input_ingredients, recipe_input_recipes.  With recipe_input_recipes you could store the recursive relationship back to a collection of ingredients without creating an intermediate concept of an "assemblage" or whatever

     

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Oh I get it! (duh!)

    split the recipe into two parts in the CTE.

    Part 1 is all the ingredients, weights for everything except the starter

    UNION ALL

    (For each ingredient in the Starter, the % by weight for each one multiplied by the original starter weight)

    Then SUM the weights, grouping by the Ingredient.

    Some daze, I swear...

    Thanks for the pointer!

  • Oh I get it! (duh!)

    split the recipe into two parts in the CTE.

    Part 1 is all the ingredients, weights for everything except the starter

    UNION ALL

    (For each ingredient in the Starter, the % by weight for each one multiplied by the original starter weight)

    Then SUM the weights, grouping by the Ingredient.

    Some daze, I swear...

    Thanks for the pointer!

  • pietlinden wrote:

    split the recipe into two parts in the CTE. Part 1 is all the ingredients, weights for everything except the starter UNION ALL (For each ingredient in the Starter, the % by weight for each one multiplied by the original starter weight)

    Then SUM the weights, grouping by the Ingredient.

    Inputs could be either ingredient(s) or recipe(s), no?  If I'm understanding the intention you're looking for a single query which resolves any recipe into its component ingredients including recipes which reference other recipes.

    Regarding the initial code above, too many brackets and questionable data types imo.

    /* Sorry, this started in Abscess */
    CREATE TABLE dbo.Ingredients(
    IngredientID int IDENTITY(1,1) NOT NULL,
    IngredientName varchar(255) NOT NULL,
    UnitPrice money NULL, /* should be int or decimal. 'money' data type is for calculating not storing */
    IngredientWeight real NULL, /* why floating point? why not int? */
    Volume real NULL, /* why floating point? why not int? */
    IsFlour bit NULL, /* hunh? this shouldn't exist */
    VendorID int NULL);
    go

    CREATE TABLE dbo.Recipes(
    RecipeID int IDENTITY(1,1) NOT NULL,
    Recipe Name nvarchar(255) NULL,
    Yield int NULL,
    Yield_Kg int NULL,
    SalePrice money NULL
    CONSTRAINT pkRecipe PRIMARY KEY (RecipeID));
    go

    CREATE TABLE dbo.RecipeIngredients(
    RecipeID int NOT NULL,
    IngredientID int NOT NULL,
    RecipeIngredientWeight real
    CONSTRAINT fkIngredientID (IngredientID) REFERENCES Ingredient(IngredientID),
    CONSTRAINT fkRecipeID (RecipeID) REFERENCES Recipe(RecipeID));
    go

    A convenient script of test tables could look something like this

    drop TABLE if exists dbo.testRecipeInputIngredients;
    drop TABLE if exists dbo.testRecipeInputRecipes;
    drop TABLE if exists dbo.testRecipeInputs;
    drop TABLE if exists dbo.testRecipes;
    drop TABLE if exists dbo.testIngredients;
    go

    drop TABLE if exists dbo.testIngredients;
    go
    CREATE TABLE dbo.testIngredients(
    IngredientID int IDENTITY(1,1) primary key NOT NULL,
    IngredientName varchar(255) NOT NULL,
    UnitPrice int NULL,
    IngredientWeight int NULL,
    Volume int NULL,
    VendorID int NULL);
    go

    drop TABLE if exists dbo.testRecipes;
    go
    CREATE TABLE dbo.testRecipes(
    RecipeID int IDENTITY(1,1) primary key NOT NULL,
    RecipeName varchar(255) NULL,
    Yield int NULL,
    Yield_Kg int NULL,
    SalePrice int NULL);
    go

    insert dbo.testRecipes(RecipeName, Yield, Yield_Kg, SalePrice) values
    ('First recipe cannot reference another recipe becuse circular', 10, 1, 1299),
    ('Second recipe does not reference another recipe', 20, 4, 199),
    ('Third recipe does not reference another recipe', 40, 12, 399),
    ('4th recipe does reference another recipe', 16, 12, 399),
    ('5th recipe references 4th recipe', 54, 10, 2349),
    ('6th recipe references 5th recipe', 24, 4, 1399);

    drop TABLE if exists dbo.testRecipeInputs;
    go
    CREATE TABLE dbo.testRecipeInputs(
    RecipeInputID int IDENTITY(1,1) primary key NOT NULL,
    RecipeID int NOT NULL,
    InputName varchar(255) NULL, /* optional */
    UnitPrice int NULL,
    InputWeight int NULL,
    Volume int NULL,
    VendorID int NULL,
    CONSTRAINT fktestRecipeInputsRecipeID foreign key (RecipeID) REFERENCES dbo.testRecipes(RecipeID));

    drop TABLE if exists dbo.testRecipeInputIngredients;
    go
    CREATE TABLE dbo.testRecipeInputIngredients(
    RiiID int IDENTITY(1,1) primary key NOT NULL,
    RecipeInputID int NOT NULL,
    IngredientID int NOT NULL,
    CONSTRAINT fktestRecipeInputIngredientsRecipeInputID foreign key (RecipeInputID) REFERENCES testRecipeInputs(RecipeInputID),
    CONSTRAINT fktestRecipeInputIngredientsIngredientID foreign key (IngredientID) REFERENCES testIngredients(IngredientID));

    drop TABLE if exists dbo.testRecipeInputRecipes;
    go
    CREATE TABLE dbo.testRecipeInputRecipes(
    RiiID int IDENTITY(1,1) primary key NOT NULL,
    RecipeInputID int NOT NULL,
    RecipeID int NOT NULL,
    CONSTRAINT fktestRecipeInputRecipesRecipeInputID foreign key (RecipeInputID) REFERENCES testRecipeInputs(RecipeInputID),
    CONSTRAINT fktestRecipeInputRecipesRecipeID foreign key (RecipeID) REFERENCES testRecipes(RecipeID));

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Think I'm gonna bite the bullet and rebuild with sane field types etc and see what happens. =)

    But to answer your question, yes - the goal is to take a recipe and return (1) the total weight of each ingredient (so break down the starter into its component parts), and then multiply and get a total cost for a loaf. The ultimate goal is to understand ingredient costs and how each affects profit.

     

    • This reply was modified 6 months, 3 weeks ago by  pietlinden.
  • Okay, I got the "explode Starter ingredients" bit working, so now it's getting easier!

    CREATE VIEW [dbo].[vwBaseIngredientRecipe]
    AS
    WITH cteRecipe(RecipeID, RecipeName, IngredientID, IngredientName, RecipeWeight)
    AS (
    SELECT
    r.RecipeID
    ,r.[Recipe Name]
    ,i.IngredientID
    ,i.IngredientName
    ,ExpandedRI.RecipeIngredientWeight

    FROM
    /* non-Starter ingredients */
    (SELECT ri.RecipeID
    ,ri.IngredientID
    --,i.IngredientName
    ,ri.RecipeIngredientWeight
    FROM RecipeIngredient ri
    INNER JOIN Ingredient i ON ri.IngredientID = i.IngredientID
    WHERE /* ri.RecipeID = @RecipeID
    AND */ i.IngredientName NOT LIKE '%Starter%'

    UNION ALL

    /* Ingredients for Starter part of recipe */
    SELECT ri.RecipeID
    /* wrong ingredients? because not starter? */
    --,ri.IngredientID
    --,i.IngredientName
    --,ri.RecipeIngredientWeight
    ,sr.IngredientID
    --,i.IngredientName
    ,RecipeIngredientWeight = ri.RecipeIngredientWeight * sr.WeightGrams/(1.000 * SUM(sr.WeightGrams) OVER (PARTITION BY ri.RecipeID))
    FROM RecipeIngredient ri
    INNER JOIN Ingredient i ON ri.IngredientID = i.IngredientID
    INNER JOIN StarterRecipe sr ON i.IngredientID = sr.ParentIngredientID
    WHERE /* ri.RecipeID = @RecipeID
    AND */ i.IngredientName LIKE '%Starter%') ExpandedRI
    INNER JOIN Recipe r
    ON ExpandedRI.RecipeID = r.RecipeID
    INNER JOIN Ingredient i
    ON ExpandedRI.IngredientID = i.IngredientID
    )
    SELECT
    cte.RecipeID
    ,cte.RecipeName
    ,cte.IngredientID
    ,cte.IngredientName
    ,SingleLoafIngredientWeightGrams = cte.RecipeWeight/(1.000 * r.Yield)
    ,r.Yield
    FROM cteRecipe cte
    INNER JOIN Recipe r
    ON r.RecipeID = cte.RecipeID;
    GO

    I think I still have to aggregate on Ingredient table columns, but that bit is trivial. Basically, split the two "sections" of the recipe so there's one "table" for Starter and another for the rest of the ingredients. Then join StarterID to the recipe for said Starter, then union that all together, then aggregate it. But the part that was getting me was how to expand everything... so now on to the DW part.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply