Kitchen Nightmares: LangChain agents edition

TL;DR

Health application to parse recipes, create meals and meal plans
Implemented agents using LangChain and LangSmith for agent observability, FastAPI and React for web app
How using LangSmith can help diagnose agent bugs
I fired the idiot sandwich agent

Who doesn't love a good home cooked meal?

Over the last couple of years, cooking has been something I've grown to love. I enjoy cooking great food for my fiancé and finding recipes that I can share with my family and friends. The hard and annoying part is sometimes sharing that recipe: either I can't find the link, or I changed something in the recipe and forgot what it was. Maybe it's a memory problem instead…

Oh well, I'm an engineer so the only solution I know of is to buy a domain, build a side project and never ship it.

At first I created a script that fetches the recipe details and saves them in a Postgres database. That solved the problem of finding these recipes again, since the information was just a select query away, but I still had the problem of handling different "recipe components". Some recipes include multiple dishes like a main, a side, sauces etc., and the parsed recipe contents provide one list of ingredients and instructions for ALL the components. There were times when I only made the main dish of one recipe and the side dish of another, so this automation still didn't solve the problem. That's when I had an idea that I GUARANTEE nobody else has had.

What if I throw AI at this?

Genius right? Claude thought so too.

The idea is a FastAPI/React project that uses LangChain for agent invocation and LangSmith for observability. The backend programmatically parses a recipe, then passes it in a prompt to a grouping agent, which finds the different recipe components (main, side, veggie, sauce etc.) and groups each component with its ingredients and instructions. A second structuring agent takes the output from the grouping agent and creates the structured response sent to the UI. I can then create a meal and/or meal plan using those components, calculate calorie information, mix and match components to create different meals, and share the components or meal plan with others. An example of the UI is shown below. It includes the inconsistent output. Notice the main dish doesn't have instructions!

The idiot sandwich agents

I am using 2 agents for the recipe extraction. One agent is responsible for grouping and generating the components, the other is responsible for just providing the structured output from the grouping agent. These are the bones of the dual agent execution:

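from langchain.agents import create_agent


def execute(url: str) -> ParsedRecipeComponentResponse:
    # AgentMetadataTracker, recipe_fetch and ParsedRecipeComponentResponse
    # are defined elsewhere in the project.
    tracker = AgentMetadataTracker()

    grouping_prompt = "..."  # prompt text omitted in this post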
    grouping_agent = create_agent(model="claude-haiku-4-5", tools=[recipe_fetch])
    grouping_result = grouping_agent.invoke(
        {
            "messages": [
                {
                    "role": "system",
                    "content": grouping_prompt,
                },
                {"role": "user", "content": url},
            ]
        },
        config={"callbacks": [tracker]},
    )

    structuring_agent = create_agent(
        model="claude-haiku-4-5",
        tools=[],
        response_format=ParsedRecipeComponentResponse,
    )
    structuring_prompt = "..."  # prompt text omitted in this post
    result = structuring_agent.invoke(
        {
            "messages": [
                {
                    "role": "system",
                    "content": structuring_prompt,
                },
                {"role": "user", "content": grouping_result["messages"][-1].content},
            ]
        },
        config={"callbacks": [tracker]},
    )

    # With response_format set, the parsed Pydantic object is returned under this key.
    return result["structured_response"]

I already know what you're thinking. Why two agents? At first, when I tried to use one agent to do the grouping and produce the structured output, it was very unreliable. Sometimes the agent would generate duplicate components: two main dishes, one with all the ingredients and the other with all the instructions. Sometimes the agent would forget the instructions in the output altogether (shown in the picture above), and other times it would just combine the entire recipe into one component.

I then split it into two focused agents, one to group the recipe components and one to just create the structured output. The funny thing is that this worked perfectly for the first day. When I went to wire the API call into the UI the NEXT DAY, the inconsistencies started showing up again. I was using print statements to debug the issues, which got quite annoying, and that's when I learned about LangSmith. It's an observability platform that is agent-framework agnostic but integrates really easily with LangChain agent invocation.
Using LangSmith, I am able to track the token cost and the input/output of each agent execution. That's perfect, because the "tracker" shown in the code above was my own attempt to monitor these metrics, and it's no longer needed.
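
For anyone who hasn't wired this up before, here is a minimal sketch of the setup. It assumes the standard LangSmith environment variables and the `run_name`, `tags` and `metadata` keys that LangChain accepts in the `config` argument of `invoke`. The helper names, the project name `recipe-parser` and the `recipe_url` metadata key are made up for illustration and are not taken from the project.

```python
import os


def enable_langsmith_tracing(project: str) -> None:
    """Turn on LangSmith tracing for LangChain calls made in this process."""
    # LANGSMITH_API_KEY must also be set in the environment; it is
    # deliberately not hard-coded here.
    os.environ["LANGSMITH_TRACING"] = "true"
    os.environ["LANGSMITH_PROJECT"] = project


def trace_config(agent_name: str, recipe_url: str) -> dict:
    """Build a `config` dict to pass to `agent.invoke(..., config=...)`."""
    return {
        "run_name": agent_name,                 # label shown on the trace
        "tags": [agent_name],                   # lets you filter runs per agent
        "metadata": {"recipe_url": recipe_url},  # hypothetical metadata key
    }


enable_langsmith_tracing("recipe-parser")

url = "https://nickskitchen.com/halal-chicken-recipe/"
grouping_config = trace_config("grouping-agent", url)
structuring_config = trace_config("structuring-agent", url)
```

Each dict would replace the `config={"callbacks": [tracker]}` argument in the corresponding `invoke` call, so the two agents' runs can be told apart when filtering traces.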

The LangSmith traces show the latency, tokens and even cost! This will help when I try to optimize agent execution; you can't improve what you don't measure. From the agent input/output I was able to confirm that the grouping agent was working as expected. I saw the exact markdown output with the instructions and ingredients for each component. Here is an example, parsed from https://nickskitchen.com/halal-chicken-recipe/:

Based on the recipe I retrieved, here are the distinct components of the Halal Chicken and Rice recipe:

## MAIN DISH: Marinated Halal Chicken

### Ingredients:
- ½ cup plain whole-milk yogurt
- 3 tablespoon neutral oil (canola or avocado)
- 1 tablespoon white vinegar or lemon juice
- 6 clove garlic (finely grated)
- 2 teaspoon ground cumin
- 2 teaspoon ground coriander
- 2 teaspoon dried oregano
- 1½ teaspoon smoked paprika
- 1 teaspoon black pepper
- 2 teaspoon kosher salt
- ½ teaspoon ground turmeric
- ½ teaspoon ground ginger
- 2 pound boneless, skinless chicken thigh

### Instructions:
1. In a large bowl, whisk together the yogurt, oil, vinegar or lemon juice, garlic, spices, and salt until smooth.
2. Add the chicken and toss to coat completely.
3. Cover and refrigerate for at least two hours, preferably overnight.
4. Heat a cast-iron skillet or grill over high heat.
5. Cook the chicken in batches for five to seven minutes per side until deeply browned with slight charring. Cook until the internal temperature reaches 165°F (74°C).
6. Transfer to a cutting board and rest for 10-15 minutes.

---

## SIDES: Yellow Rice

### Ingredients:
- 2 cup long-grain white rice
- 2 cup chicken stock
- 1 tablespoon butter
- ½ teaspoon ground turmeric
- 1 teaspoon cumin seeds
- 1 bay leaf
- ¼ teaspoon garlic powder (optional)
- 1 teaspoon kosher salt

### Instructions:
1. Rinse the rice in a fine mesh strainer until the water runs mostly clear.
2. Add the rice, chicken stock, butter, turmeric, cumin seeds, bay leaf, garlic powder, and salt to a rice cooker or pot.
3. Cook using the white rice setting or simmer covered until tender.
4. Fluff with a fork and keep warm.

---

## SAUCES: White Sauce (Garlic Yogurt)

### Ingredients:
- 1 cup mayonnaise
- ½ cup plain Greek yogurt
- 2 tablespoon lemon juice
- 2 tablespoon white vinegar
- 4 clove garlic (finely grated)
- 1 tablespoon sugar
- 1 teaspoon black pepper
- 1 teaspoon salt
- 1 teaspoon dried dill
- 1 teaspoon dried tarragon

### Instructions:
1. Whisk all ingredients together in a bowl until smooth.
2. Add a little cold water if needed to thin to a pourable consistency.
3. Transfer to a squeeze bottle (optional) and refrigerate for at least two hours, preferably overnight.

---

## SAUCES: Red Sauce (Hot Chile)

### Ingredients:
- 6-8 dried red chiles (chile de árbol or similar)
- 4 clove garlic
- 1 tablespoon white vinegar
- 1½ teaspoon paprika
- ½-1 teaspoon cayenne pepper (to taste)
- 1 teaspoon kosher salt
- ½ cup neutral oil

### Instructions:
1. Soak the dried chiles in boiling water for ten minutes.
2. Add softened chiles, garlic, vinegar, paprika, cayenne, and salt to a blender and blend until smooth.
3. With the blender running, slowly stream in the oil to emulsify.
4. Add soaking liquid as needed to thin.
5. Transfer to a squeeze bottle and chill for at least two hours.

---

## SALADS: Lettuce and Tomato Salad

### Ingredients:
- iceberg lettuce (finely shredded)
- ripe tomatoes (thinly sliced)

### Instructions:
1. Finely shred the lettuce and thinly slice the tomatoes.
2. Refrigerate until ready to serve to keep crisp.

---

## BREAD: Pita Bread

### Ingredients:
- pita bread (warmed)

### Instructions:
1. Warm the pita in a dry pan or oven until soft and lightly toasted.

This is EXACTLY what I expect from the grouping agent, and the structuring agent receives this as its input, yet it somehow doesn't consistently produce the correct output. Here is the output from the structuring agent with the above input:

{"components":[{"name":"Marinated Halal Chicken","cuisine":"Middle Eastern","source":"https://nickskitchen.com/halal-chicken-recipe/`, "},{"name":"Yellow Rice","cuisine":"Middle Eastern","source":"https://nickskitchen.com/halal-chicken-recipe/`, "},{"name":"White Sauce (Garlic Yogurt)","cuisine":"Middle Eastern","source":"https://nickskitchen.com/halal-chicken-recipe/`, "},{"name":"Red Sauce (Hot Chile)","cuisine":"Middle Eastern","source":"https://nickskitchen.com/halal-chicken-recipe/`, "},{"name":"Lettuce and Tomato Salad","cuisine":"Middle Eastern","source":"https://nickskitchen.com/halal-chicken-recipe/`, "},{"name":"Pita Bread","cuisine":"Middle Eastern","source":"https://nickskitchen.com/halal-chicken-recipe/"}]}

In this example the instructions and ingredients are COMPLETELY missing, even though this is the schema for the expected structured response:

from typing import List

from pydantic import BaseModel, Field


class RecipeComponent(BaseModel):
    """Parsed Recipe Information"""

    name: str = Field(description="Name of the recipe component")
    cuisine: str = Field(
        description=(
            "Cuisine type inferred from the recipe "
            "(e.g. 'Indian', 'Mexican', 'Italian'). "
            "Use 'Unknown' if it cannot be determined."
        )
    )
    instructions: List[str] = Field(
        default_factory=list,
        description="List of instructions for this recipe component",
    )
    ingredients: List[str] = Field(
        default_factory=list,
        description="List of ingredients for this recipe component",
    )
    source: str = Field(description="Source of where this recipe component was parsed")

class ParsedRecipeComponentResponse(BaseModel):
    components: List[RecipeComponent] = Field(
        description="List of components parsed from the recipe"
    )
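
One detail in this schema may be worth ruling out. `instructions` and `ingredients` both use `default_factory=list`, and Pydantic leaves fields that have a default out of the `required` list in the generated JSON schema, so a response without them still validates. Whether that is why the structuring agent drops them is a hypothesis to check in the traces, not a confirmed cause. The sketch below assumes Pydantic v2; `LenientComponent` is a trimmed copy of the list fields above, and `StrictComponent` is a hypothetical variant that makes both lists required and non-empty.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class LenientComponent(BaseModel):
    """Same list definitions as RecipeComponent (other fields trimmed)."""

    name: str
    instructions: List[str] = Field(default_factory=list)
    ingredients: List[str] = Field(default_factory=list)


class StrictComponent(BaseModel):
    """Hypothetical variant: both lists required, at least one item each."""

    name: str
    instructions: List[str] = Field(min_length=1)
    ingredients: List[str] = Field(min_length=1)


def required_fields(model: type[BaseModel]) -> set[str]:
    """Return the field names marked as required in the JSON schema."""
    return set(model.model_json_schema().get("required", []))


lenient_required = required_fields(LenientComponent)
strict_required = required_fields(StrictComponent)

# A payload shaped like the bad output: a name and nothing else.
payload = {"name": "Yellow Rice"}

# The lenient model accepts it and silently fills in empty lists.
lenient = LenientComponent(**payload)

# The strict model refuses it, so the failure surfaces as an error.
try:
    StrictComponent(**payload)
    strict_rejected = False
except ValidationError:
    strict_rejected = True
```

With the lenient definitions only `name` is required, so a component with no ingredients or instructions passes silently. The strict variant turns the same payload into a validation error, which is at least easier to spot than an empty card in the UI.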

I have to fix these idiot sandwiches now, and it's definitely my fault. Thankfully, LangSmith can help me identify which agent is causing the issue, and whether I even need 2 agents at all.

What's next

First I want to fix the agent reliability - I noticed that the structured output contains a cuisine field that isn't part of my original prompt for the grouping agent, and isn't included in the markdown output. I wonder if this causes the structuring agent to become an idiot sandwich.
Adding LLM validation - I'm familiar with the LLM-as-a-judge pattern for output verification, but I feel like that would be overkill in this situation. I should be able to create a Python function to validate the outputs, ensure the markdown is in the correct format and that the structured response does not contain duplicates etc.
Improving Observability - I want a metric that links all agent executions to a parsing event. When a user triggers some parsing event, I want to be able to trace every aspect of that request.
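
As a starting point for the plain-Python validation idea, here is a minimal sketch using only the standard library. It assumes the grouping agent keeps the `## CATEGORY: Name`, `### Ingredients:` and `### Instructions:` layout from the example output, and that the structured response has been converted to a dict (for example with Pydantic's `model_dump()`). The function names and error messages are invented for illustration.

```python
import re
from typing import Dict, List

# Matches component headings such as "## SIDES: Yellow Rice",
# but not the "### Ingredients:" sub-headings.
COMPONENT_HEADING = re.compile(r"^## [^:\n]+: .+$", re.MULTILINE)


def validate_markdown(markdown: str) -> List[str]:
    """Check that every component section has both sub-headings."""
    headings = list(COMPONENT_HEADING.finditer(markdown))
    if not headings:
        return ["no component headings found"]
    errors = []
    for i, match in enumerate(headings):
        # A section runs from its heading to the next heading (or the end).
        end = headings[i + 1].start() if i + 1 < len(headings) else len(markdown)
        section = markdown[match.start():end]
        title = match.group(0)[3:]
        for sub in ("### Ingredients:", "### Instructions:"):
            if sub not in section:
                errors.append(f"{title}: missing '{sub}'")
    return errors


def validate_response(response: Dict) -> List[str]:
    """Flag duplicate component names and empty ingredient/instruction lists."""
    errors = []
    seen = set()
    for component in response.get("components", []):
        name = component.get("name", "").strip()
        key = name.lower()  # compare names case-insensitively
        if key in seen:
            errors.append(f"{name}: duplicate component")
        seen.add(key)
        for field in ("ingredients", "instructions"):
            if not component.get(field):
                errors.append(f"{name}: empty '{field}'")
    return errors


sample_markdown = """## MAIN DISH: Chicken
### Ingredients:
- chicken
### Instructions:
1. Cook.

## SIDES: Rice
### Ingredients:
- rice
"""
sample_response = {
    "components": [
        {"name": "Chicken", "ingredients": ["chicken"], "instructions": []},
        {"name": "chicken", "ingredients": ["chicken"], "instructions": ["Cook."]},
    ]
}

markdown_errors = validate_markdown(sample_markdown)
response_errors = validate_response(sample_response)
```

Both functions return a list of error strings rather than raising, so the results could be logged or attached to a trace without stopping the request.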