Research Assistant¶
Review¶
We have learned the following core LangGraph concepts:
- Memory
- Human-in-the-loop
- Controllability
Now let's combine all three to build one of the most popular AI use cases: research automation.
Research is labor-intensive work for an analyst, so let's see how AI can help with it.
There are problems to solve before research can be automated well. LLM output is often poorly aligned with our real-world needs: it states the obvious, or it fails to find and cite its sources accurately.
The research and report generation docs describe a fairly good example workflow. We will implement that methodology with LangGraph.
Goal¶
The goal is as follows:
Build a lightweight multi-agent system that carries out research work.
Source Selection
- The user selects the input sources.
Planning
- The user provides a topic, and the system creates a "team" of AI analysts; each analyst takes on a sub-topic to analyze.
- In a Human-in-the-loop step, a person will refine the sub-topics; splitting up the work is something a human still does best.
LLM Utilization
- Each analyst conducts an in-depth interview with another expert AI, drawing on the selected sources.
- The interview is a multi-turn conversation, which is how we extract detailed insights; the STORM paper is a helpful reference.
- Each multi-turn interview will be a sub-graph, since every interview needs to keep its own state.
Research Process
- The experts gather information to answer the analysts' questions, and they do so in parallel.
- All interviews run concurrently and their results are merged afterwards: classic map-reduce (see the sketch after this list).
Output Format
- The insights collected from each interview are distilled into a final report.
- We use prompting to enforce the desired output format.
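Before we build the real thing, here is a minimal, self-contained sketch of that map-reduce pattern using LangGraph's Send() API (runnable once the packages below are installed). The node and key names (research, topics, notes) are illustrative placeholders, not part of the graph we build later.
import operator
from typing import Annotated, List
from typing_extensions import TypedDict
from langgraph.constants import Send
from langgraph.graph import StateGraph, START, END

class OverallState(TypedDict):
    topics: List[str]                     # items to fan out over ("map")
    notes: Annotated[list, operator.add]  # parallel results are appended here ("reduce")

class WorkerState(TypedDict):
    topic: str

def fan_out(state: OverallState):
    # One Send() per item: each runs the "research" node in parallel with its own state
    return [Send("research", {"topic": t}) for t in state["topics"]]

def research(state: WorkerState):
    # Stand-in for the real work (an interview sub-graph in this lesson)
    return {"notes": [f"notes on {state['topic']}"]}

map_reduce_builder = StateGraph(OverallState)
map_reduce_builder.add_node("research", research)
map_reduce_builder.add_conditional_edges(START, fan_out, ["research"])
map_reduce_builder.add_edge("research", END)
print(map_reduce_builder.compile().invoke({"topics": ["Llama3", "Llama3.1", "Llama3.2"]}))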
%%capture --no-stderr
%pip install --quiet -U langgraph langchain_openai langchain_community langchain_core tavily-python wikipedia
Setup¶
import os, getpass
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("OPENAI_API_KEY")
OPENAI_API_KEY: ··········
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o", temperature=0)
_set_env("LANGCHAIN_API_KEY")
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "langchain-academy"
LANGCHAIN_API_KEY: ··········
Generate Analysts: Human-In-The-Loop¶
We generate the analysts, then pause via human-in-the-loop.
This lets us review the generated analysts directly before continuing.
from typing import List
from typing_extensions import TypedDict
from pydantic import BaseModel, Field
class Analyst(BaseModel):
affiliation: str = Field(
description="Primary affiliation of the analyst.",
)
name: str = Field(
description="Name of the analyst."
)
role: str = Field(
description="Role of the analyst in the context of the topic.",
)
description: str = Field(
description="Description of the analyst focus, concerns, and motives.",
)
@property
def persona(self) -> str:
return f"Name: {self.name}\nRole: {self.role}\nAffiliation: {self.affiliation}\nDescription: {self.description}\n"
class Perspectives(BaseModel):
analysts: List[Analyst] = Field(
description="Comprehensive list of analysts with their roles and affiliations.",
)
class GenerateAnalystsState(TypedDict):
topic: str # Research topic
max_analysts: int # Number of analysts
human_analyst_feedback: str # Human feedback
analysts: List[Analyst] # Analyst asking questions
from IPython.display import Image, display
from langgraph.graph import START, END, StateGraph
from langgraph.checkpoint.memory import MemorySaver
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
analyst_instructions="""You are tasked with creating a set of AI analyst personas. Follow these instructions carefully:
1. First, review the research topic:
{topic}
2. Examine any editorial feedback that has been optionally provided to guide creation of the analysts:
{human_analyst_feedback}
3. Determine the most interesting themes based upon documents and / or feedback above.
4. Pick the top {max_analysts} themes.
5. Assign one analyst to each theme."""
def create_analysts(state: GenerateAnalystsState):
""" Create analysts """
topic=state['topic']
max_analysts=state['max_analysts']
human_analyst_feedback=state.get('human_analyst_feedback', '')
# Enforce structured output
structured_llm = llm.with_structured_output(Perspectives)
# System message
system_message = analyst_instructions.format(topic=topic,
human_analyst_feedback=human_analyst_feedback,
max_analysts=max_analysts)
    # Generate analysts
analysts = structured_llm.invoke([SystemMessage(content=system_message)]+[HumanMessage(content="Generate the set of analysts.")])
    # Write the list of analysts to state
return {"analysts": analysts.analysts}
def human_feedback(state: GenerateAnalystsState):
""" No-op node that should be interrupted on """
pass
def should_continue(state: GenerateAnalystsState):
""" Return the next node to execute """
# Check if human feedback
human_analyst_feedback=state.get('human_analyst_feedback', None)
if human_analyst_feedback:
return "create_analysts"
# Otherwise end
return END
# Add nodes and edges
builder = StateGraph(GenerateAnalystsState)
builder.add_node("create_analysts", create_analysts)
builder.add_node("human_feedback", human_feedback)
builder.add_edge(START, "create_analysts")
builder.add_edge("create_analysts", "human_feedback")
builder.add_conditional_edges("human_feedback", should_continue, ["create_analysts", END])
# Compile
memory = MemorySaver()
graph = builder.compile(interrupt_before=['human_feedback'], checkpointer=memory)
# View
display(Image(graph.get_graph(xray=1).draw_mermaid_png()))
# Input
max_analysts = 3
topic = "Llama3, Llama3.1, Llama3.2 모델들에 대한 분석"
thread = {"configurable": {"thread_id": "1"}}
# Run the graph until the first interruption
for event in graph.stream({"topic":topic,"max_analysts":max_analysts,}, thread, stream_mode="values"):
# Review
analysts = event.get('analysts', '')
if analysts:
for analyst in analysts:
print(f"Name: {analyst.name}")
print(f"Affiliation: {analyst.affiliation}")
print(f"Role: {analyst.role}")
print(f"Description: {analyst.description}")
print("-" * 50)
Name: Dr. Emily Carter
Affiliation: OpenAI
Role: Model Performance Specialist
Description: Dr. Carter focuses on the performance metrics and efficiency of AI models. She is particularly interested in the computational efficiency and accuracy of Llama3 series models.
--------------------------------------------------
Name: Prof. Michael Zhang
Affiliation: Stanford University
Role: Ethics and Bias Researcher
Description: Prof. Zhang examines the ethical implications and potential biases in AI models. His work with the Llama3 series involves identifying and mitigating biases in the models' outputs.
--------------------------------------------------
Name: Dr. Sarah Lee
Affiliation: Google DeepMind
Role: Innovation and Applications Expert
Description: Dr. Lee explores innovative applications and real-world use cases of AI models. She is interested in how the Llama3 series can be applied in various industries and the potential for new technological advancements.
--------------------------------------------------
# Get state and look at next node
state = graph.get_state(thread)
state.next
('human_feedback',)
# We now update the state as if we are the human_feedback node
graph.update_state(thread, {"human_analyst_feedback":
    "Add someone who can research how startups could make use of the Llama models"}, as_node="human_feedback")
{'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef80803-4512-6d04-8002-c5b0b3792ba9'}}
# Continue the graph execution
for event in graph.stream(None, thread, stream_mode="values"):
# Review
analysts = event.get('analysts', '')
if analysts:
for analyst in analysts:
print(f"Name: {analyst.name}")
print(f"Affiliation: {analyst.affiliation}")
print(f"Role: {analyst.role}")
print(f"Description: {analyst.description}")
print("-" * 50)
Name: Dr. Emily Carter
Affiliation: OpenAI
Role: Model Performance Specialist
Description: Dr. Carter focuses on the performance metrics and efficiency of AI models. She is particularly interested in the computational efficiency and accuracy of Llama3 series models.
--------------------------------------------------
Name: Prof. Michael Zhang
Affiliation: Stanford University
Role: Ethics and Bias Researcher
Description: Prof. Zhang examines the ethical implications and potential biases in AI models. His work with the Llama3 series involves identifying and mitigating biases in the models' outputs.
--------------------------------------------------
Name: Dr. Sarah Lee
Affiliation: Google DeepMind
Role: Innovation and Applications Expert
Description: Dr. Lee explores innovative applications and real-world use cases of AI models. She is interested in how the Llama3 series can be applied in various industries and the potential for new technological advancements.
--------------------------------------------------
Name: Dr. Min-Jae Kim
Affiliation: AI Research Lab
Role: Lead AI Researcher
Description: Dr. Kim focuses on the technical advancements and performance metrics of the Llama3 series models. His primary concern is to evaluate the models' capabilities, limitations, and potential improvements.
--------------------------------------------------
Name: Soo-Jin Park
Affiliation: Tech Startup Incubator
Role: Startup Strategy Consultant
Description: Soo-Jin specializes in advising startups on integrating advanced AI models like Llama3 into their business strategies. She explores practical applications, cost-benefit analyses, and market positioning.
--------------------------------------------------
Name: Professor Hyeon-Woo Lee
Affiliation: University of Seoul
Role: AI Ethics and Policy Expert
Description: Professor Lee examines the ethical implications and policy considerations of deploying Llama3 models. His focus includes data privacy, algorithmic bias, and regulatory compliance.
--------------------------------------------------
# If we are satisfied, then we simply supply no feedback
further_feedback = None
graph.update_state(thread, {"human_analyst_feedback":
    further_feedback}, as_node="human_feedback")
{'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef80804-889b-6751-8004-bb0dd29aadcc'}}
# Continue the graph execution to end
for event in graph.stream(None, thread, stream_mode="updates"):
print("--Node--")
node_name = next(iter(event.keys()))
print(node_name)
final_state = graph.get_state(thread)
analysts = final_state.values.get('analysts')
final_state.next
()
for analyst in analysts:
print(f"Name: {analyst.name}")
print(f"Affiliation: {analyst.affiliation}")
print(f"Role: {analyst.role}")
print(f"Description: {analyst.description}")
print("-" * 50)
Name: Dr. Min-Jae Kim
Affiliation: AI Research Lab
Role: Lead AI Researcher
Description: Dr. Kim focuses on the technical advancements and performance metrics of the Llama3 series models. His primary concern is to evaluate the models' capabilities, limitations, and potential improvements.
--------------------------------------------------
Name: Soo-Jin Park
Affiliation: Tech Startup Incubator
Role: Startup Strategy Consultant
Description: Soo-Jin specializes in advising startups on integrating advanced AI models like Llama3 into their business strategies. She explores practical applications, cost-benefit analyses, and market positioning.
--------------------------------------------------
Name: Professor Hyeon-Woo Lee
Affiliation: University of Seoul
Role: AI Ethics and Policy Expert
Description: Professor Lee examines the ethical implications and policy considerations of deploying Llama3 models. His focus includes data privacy, algorithmic bias, and regulatory compliance.
--------------------------------------------------
import operator
from typing import Annotated
from langgraph.graph import MessagesState
class InterviewState(MessagesState):
    max_num_turns: int # Number of conversation turns
context: Annotated[list, operator.add] # Source docs
analyst: Analyst # Analyst asking questions
interview: str # Interview transcript
sections: list # Final key we duplicate in outer state for Send() API
class SearchQuery(BaseModel):
search_query: str = Field(None, description="Search query for retrieval.")
question_instructions = """You are an analyst tasked with interviewing an expert to learn about a specific topic.
Your goal is to boil the conversation down to interesting and specific insights related to your topic.
1. Interesting: Insights that people will find surprising or non-obvious.
2. Specific: Insights that avoid generalities and include specific examples from the expert.
Here is your topic of focus and set of goals: {goals}
Begin by introducing yourself using a name that fits your persona, and then ask your question.
Continue to ask questions to drill down and refine your understanding of the topic.
When you are satisfied with your understanding, complete the interview with: "Thank you so much for your help!"
Remember to stay in character throughout your response, reflecting the persona and goals provided to you."""
def generate_question(state: InterviewState):
""" Node to generate a question """
# Get state
analyst = state["analyst"]
messages = state["messages"]
# Generate question
system_message = question_instructions.format(goals=analyst.persona)
question = llm.invoke([SystemMessage(content=system_message)]+messages)
# Write messages to state
return {"messages": [question]}
Generating Answers: In Parallel!¶
The experts gather the information for their answers from a variety of sources:
- Web crawling (parsing specific pages rather than searching), e.g. WebBaseLoader (see the sketch below)
- RAG over pre-curated documents
- Web search
- Wikipedia search
Other web-search tools, such as Tavily, can also be used.
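As a hedged sketch of the crawling option, a WebBaseLoader-based node could look roughly like the following; the URL list is just an example, and the node mirrors the search_web node defined below by appending formatted text to the shared context key.
from langchain_community.document_loaders import WebBaseLoader

def load_fixed_pages(state: InterviewState):
    """ Retrieve docs by parsing specific pages (no search) -- illustrative only """
    urls = ["https://ai.meta.com/blog/meta-llama-3-1/"]  # example source list
    docs = WebBaseLoader(urls).load()
    # Format each page the same way as the search nodes below
    formatted_docs = "\n\n---\n\n".join(
        f'<Document href="{d.metadata.get("source", "")}"/>\n{d.page_content}\n</Document>'
        for d in docs
    )
    return {"context": [formatted_docs]}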
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("TAVILY_API_KEY")
TAVILY_API_KEY: ··········
# Web search tool
from langchain_community.tools.tavily_search import TavilySearchResults
tavily_search = TavilySearchResults(max_results=3)
# Wikipedia search tool
from langchain_community.document_loaders import WikipediaLoader
Let's create the nodes that search Wikipedia and the web.
We will also create a node that generates the expert's answer for the analyst,
and a node that records the full interview and distills it into a report section.
from langchain_core.messages import get_buffer_string
# Search query writing
search_instructions = SystemMessage(content=f"""You will be given a conversation between an analyst and an expert.
Your goal is to generate a well-structured query for use in retrieval and / or web-search related to the conversation.
First, analyze the full conversation.
Pay particular attention to the final question posed by the analyst.
Convert this final question into a well-structured web search query""")
def search_web(state: InterviewState):
""" Retrieve docs from web search """
# Search query
structured_llm = llm.with_structured_output(SearchQuery)
search_query = structured_llm.invoke([search_instructions]+state['messages'])
# Search
search_docs = tavily_search.invoke(search_query.search_query)
# Format
formatted_search_docs = "\n\n---\n\n".join(
[
f'<Document href="{doc["url"]}"/>\n{doc["content"]}\n</Document>'
for doc in search_docs
]
)
return {"context": [formatted_search_docs]}
def search_wikipedia(state: InterviewState):
""" Retrieve docs from wikipedia """
# Search query
structured_llm = llm.with_structured_output(SearchQuery)
search_query = structured_llm.invoke([search_instructions]+state['messages'])
# Search
search_docs = WikipediaLoader(query=search_query.search_query,
load_max_docs=2).load()
# Format
formatted_search_docs = "\n\n---\n\n".join(
[
f'<Document source="{doc.metadata["source"]}" page="{doc.metadata.get("page", "")}"/>\n{doc.page_content}\n</Document>'
for doc in search_docs
]
)
return {"context": [formatted_search_docs]}
answer_instructions = """You are an expert being interviewed by an analyst.
Here is the analyst's area of focus: {goals}.
Your goal is to answer a question posed by the interviewer.
To answer the question, use this context:
{context}
When answering questions, follow these guidelines:
1. Use only the information provided in the context.
2. Do not introduce external information or make assumptions beyond what is explicitly stated in the context.
3. The context contains sources at the top of each individual document.
4. Include these sources in your answer next to any relevant statements. For example, for source # 1 use [1].
5. List your sources in order at the bottom of your answer. [1] Source 1, [2] Source 2, etc
6. If the source is: <Document source="assistant/docs/llama3_1.pdf" page="7"/>' then just list:
[1] assistant/docs/llama3_1.pdf, page 7
And skip the addition of the brackets as well as the Document source preamble in your citation."""
def generate_answer(state: InterviewState):
""" Node to answer a question """
# Get state
analyst = state["analyst"]
messages = state["messages"]
context = state["context"]
# Answer question
system_message = answer_instructions.format(goals=analyst.persona, context=context)
answer = llm.invoke([SystemMessage(content=system_message)]+messages)
# Name the message as coming from the expert
answer.name = "expert"
# Append it to state
return {"messages": [answer]}
def save_interview(state: InterviewState):
""" Save interviews """
# Get messages
messages = state["messages"]
# Convert interview to a string
interview = get_buffer_string(messages)
# Save to interviews key
return {"interview": interview}
def route_messages(state: InterviewState,
name: str = "expert"):
""" Route between question and answer """
# Get messages
messages = state["messages"]
max_num_turns = state.get('max_num_turns',2)
# Check the number of expert answers
num_responses = len(
[m for m in messages if isinstance(m, AIMessage) and m.name == name]
)
# End if expert has answered more than the max turns
if num_responses >= max_num_turns:
return 'save_interview'
# This router is run after each question - answer pair
# Get the last question asked to check if it signals the end of discussion
last_question = messages[-2]
if "Thank you so much for your help" in last_question.content:
return 'save_interview'
return "ask_question"
section_writer_instructions = """You are an expert technical writer.
Your task is to create a short, easily digestible section of a report based on a set of source documents.
1. Analyze the content of the source documents:
- The name of each source document is at the start of the document, with the <Document tag.
2. Create a report structure using markdown formatting:
- Use ## for the section title
- Use ### for sub-section headers
3. Write the report following this structure:
a. Title (## header)
b. Summary (### header)
c. Sources (### header)
4. Make your title engaging based upon the focus area of the analyst:
{focus}
5. For the summary section:
- Set up summary with general background / context related to the focus area of the analyst
- Emphasize what is novel, interesting, or surprising about insights gathered from the interview
- Create a numbered list of source documents, as you use them
- Do not mention the names of interviewers or experts
- Aim for approximately 400 words maximum
- Use numbered sources in your report (e.g., [1], [2]) based on information from source documents
6. In the Sources section:
- Include all sources used in your report
- Provide full links to relevant websites or specific document paths
- Separate each source by a newline. Use two spaces at the end of each line to create a newline in Markdown.
- It will look like:
### Sources
[1] Link or Document name
[2] Link or Document name
7. Be sure to combine sources. For example this is not correct:
[3] https://ai.meta.com/blog/meta-llama-3-1/
[4] https://ai.meta.com/blog/meta-llama-3-1/
There should be no redundant sources. It should simply be:
[3] https://ai.meta.com/blog/meta-llama-3-1/
8. Final review:
- Ensure the report follows the required structure
- Include no preamble before the title of the report
- Check that all guidelines have been followed"""
def write_section(state: InterviewState):
    """ Node to write a report section """
# Get state
interview = state["interview"]
context = state["context"]
analyst = state["analyst"]
# Write section using either the gathered source docs from interview (context) or the interview itself (interview)
system_message = section_writer_instructions.format(focus=analyst.description)
section = llm.invoke([SystemMessage(content=system_message)]+[HumanMessage(content=f"Use this source to write your section: {context}")])
# Append it to state
return {"sections": [section.content]}
# Add nodes and edges
interview_builder = StateGraph(InterviewState)
interview_builder.add_node("ask_question", generate_question)
interview_builder.add_node("search_web", search_web)
interview_builder.add_node("search_wikipedia", search_wikipedia)
interview_builder.add_node("answer_question", generate_answer)
interview_builder.add_node("save_interview", save_interview)
interview_builder.add_node("write_section", write_section)
# Flow
interview_builder.add_edge(START, "ask_question")
interview_builder.add_edge("ask_question", "search_web")
interview_builder.add_edge("ask_question", "search_wikipedia")
interview_builder.add_edge("search_web", "answer_question")
interview_builder.add_edge("search_wikipedia", "answer_question")
interview_builder.add_conditional_edges("answer_question", route_messages,['ask_question','save_interview'])
interview_builder.add_edge("save_interview", "write_section")
interview_builder.add_edge("write_section", END)
# Interview
memory = MemorySaver()
interview_graph = interview_builder.compile(checkpointer=memory).with_config(run_name="Conduct Interviews")
# View
display(Image(interview_graph.get_graph().draw_mermaid_png()))
# Pick one analyst
analysts[0]
Analyst(affiliation='AI Research Lab', name='Dr. Min-Jae Kim', role='Lead AI Researcher', description="Dr. Kim focuses on the technical advancements and performance metrics of the Llama3 series models. His primary concern is to evaluate the models' capabilities, limitations, and potential improvements.")
from IPython.display import Markdown
messages = [HumanMessage(f"So you said you were writing an article on {topic}?")]
thread = {"configurable": {"thread_id": "1"}}
interview = interview_graph.invoke({"analyst": analysts[0], "messages": messages, "max_num_turns": 2}, thread)
Markdown(interview['sections'][0])
Evaluating the Technical Advancements and Performance Metrics of Llama3 Series Models¶
Summary¶
The Llama3 series, developed by Meta AI, represents a significant advancement in the field of large language models (LLMs). Positioned as a competitor to OpenAI's GPT series, Llama3 models are designed to excel in various natural language processing (NLP) tasks such as text generation, conversation, and summarization [1]. This report delves into the capabilities, limitations, and potential improvements of the Llama3 series, with a particular focus on the Llama3.1 model.
Llama3.1, the latest iteration, boasts an impressive 405 billion parameters, marking a substantial leap from its predecessors [2]. This model is designed to be versatile and powerful, capable of handling a wide range of tasks with improved performance metrics. Notably, Llama3.1 outperforms Llama3 in several benchmarks, including math tasks where it shows a 14% improvement [3]. This enhancement is attributed to its larger context window and better NLP capabilities, making it a more suitable choice for tasks requiring extensive context [4].
One of the most surprising insights is the performance of the smaller Llama3 8B model, which, despite being ten times smaller than the Llama2 70B, produces similar results [2]. This indicates that Meta AI has made significant strides in optimizing model efficiency without compromising performance.
Key improvements in Llama3.1 include faster performance, broader versatility, and enhanced ease of use. These advancements make it a worthy upgrade for those looking to leverage the power of AI in their projects [5]. However, it's important to note that while Llama3.1 offers numerous benefits, Llama3 still holds its own, particularly in specific use cases where its capabilities are sufficient [5].
In summary, the Llama3 series, particularly the Llama3.1 model, showcases Meta AI's commitment to advancing the field of artificial intelligence. The models' capabilities, combined with their performance metrics, make them strong contenders in the competitive landscape of LLMs.
Sources¶
[1] https://kili-technology.com/large-language-models-llms/llama-3-guide-everything-you-need-to-know-about-meta-s-new-model-and-its-data
[2] https://medium.com/@soumava.dey.aig/decoding-llama-3-a-quick-overview-of-the-model-7e69abcdbe6a
[3] https://medium.com/@getanakin/llama3-vs-llama-comprehensive-review-of-benchmarks-and-pricing-1767fd1bd04a
[4] https://medium.com/@kagglepro/llama-3-vs-llama-3-1-which-is-the-better-fit-for-your-ai-projects-a57a052b89b1
[5] http://anakin.ai/blog/llama3-1-vs-llama-3/
import operator
from typing import List, Annotated
from typing_extensions import TypedDict
class ResearchGraphState(TypedDict):
topic: str # Research topic
max_analysts: int # Number of analysts
human_analyst_feedback: str # Human feedback
analysts: List[Analyst] # Analyst asking questions
sections: Annotated[list, operator.add] # Send() API key
introduction: str # Introduction for the final report
content: str # Content for the final report
conclusion: str # Conclusion for the final report
final_report: str # Final report
from langgraph.constants import Send
def initiate_all_interviews(state: ResearchGraphState):
""" This is the "map" step where we run each interview sub-graph using Send API """
# Check if human feedback
human_analyst_feedback=state.get('human_analyst_feedback')
if human_analyst_feedback:
# Return to create_analysts
return "create_analysts"
# Otherwise kick off interviews in parallel via Send() API
else:
topic = state["topic"]
return [Send("conduct_interview", {"analyst": analyst,
"messages": [HumanMessage(
content=f"So you said you were writing an article on {topic}?"
)
]}) for analyst in state["analysts"]]
report_writer_instructions = """You are a technical writer creating a report on this overall topic:
{topic}
You have a team of analysts. Each analyst has done two things:
1. They conducted an interview with an expert on a specific sub-topic.
2. They wrote up their findings into a memo.
Your task:
1. You will be given a collection of memos from your analysts.
2. Think carefully about the insights from each memo.
3. Consolidate these into a crisp overall summary that ties together the central ideas from all of the memos.
4. Summarize the central points in each memo into a cohesive single narrative.
To format your report:
1. Use markdown formatting.
2. Include no pre-amble for the report.
3. Use no sub-heading.
4. Start your report with a single title header: ## Insights
5. Do not mention any analyst names in your report.
6. Preserve any citations in the memos, which will be annotated in brackets, for example [1] or [2].
7. Create a final, consolidated list of sources and add to a Sources section with the `## Sources` header.
8. List your sources in order and do not repeat.
[1] Source 1
[2] Source 2
Here are the memos from your analysts to build your report from:
{context}"""
def write_report(state: ResearchGraphState):
# Full set of sections
sections = state["sections"]
topic = state["topic"]
# Concat all sections together
formatted_str_sections = "\n\n".join([f"{section}" for section in sections])
# Summarize the sections into a final report
system_message = report_writer_instructions.format(topic=topic, context=formatted_str_sections)
report = llm.invoke([SystemMessage(content=system_message)]+[HumanMessage(content=f"Write a report based upon these memos.")])
return {"content": report.content}
intro_conclusion_instructions = """You are a technical writer finishing a report on {topic}
You will be given all of the sections of the report.
Your job is to write a crisp and compelling introduction or conclusion section.
The user will instruct you whether to write the introduction or conclusion.
Include no pre-amble for either section.
Target around 100 words, crisply previewing (for introduction) or recapping (for conclusion) all of the sections of the report.
Use markdown formatting.
For your introduction, create a compelling title and use the # header for the title.
For your introduction, use ## Introduction as the section header.
For your conclusion, use ## Conclusion as the section header.
Here are the sections to reflect on for writing: {formatted_str_sections}"""
def write_introduction(state: ResearchGraphState):
# Full set of sections
sections = state["sections"]
topic = state["topic"]
# Concat all sections together
formatted_str_sections = "\n\n".join([f"{section}" for section in sections])
# Summarize the sections into a final report
instructions = intro_conclusion_instructions.format(topic=topic, formatted_str_sections=formatted_str_sections)
intro = llm.invoke([instructions]+[HumanMessage(content=f"Write the report introduction")])
return {"introduction": intro.content}
def write_conclusion(state: ResearchGraphState):
# Full set of sections
sections = state["sections"]
topic = state["topic"]
# Concat all sections together
formatted_str_sections = "\n\n".join([f"{section}" for section in sections])
# Summarize the sections into a final report
instructions = intro_conclusion_instructions.format(topic=topic, formatted_str_sections=formatted_str_sections)
conclusion = llm.invoke([instructions]+[HumanMessage(content=f"Write the report conclusion")])
return {"conclusion": conclusion.content}
def finalize_report(state: ResearchGraphState):
    """ This is the "reduce" step where we gather all the sections, combine them, and reflect on them to write the intro/conclusion """
# Save full final report
content = state["content"]
    if content.startswith("## Insights"):
        content = content.removeprefix("## Insights")
if "## Sources" in content:
try:
content, sources = content.split("\n## Sources\n")
except:
sources = None
else:
sources = None
final_report = state["introduction"] + "\n\n---\n\n" + content + "\n\n---\n\n" + state["conclusion"]
if sources is not None:
final_report += "\n\n## Sources\n" + sources
return {"final_report": final_report}
# Add nodes and edges
builder = StateGraph(ResearchGraphState)
builder.add_node("create_analysts", create_analysts)
builder.add_node("human_feedback", human_feedback)
builder.add_node("conduct_interview", interview_builder.compile())
builder.add_node("write_report",write_report)
builder.add_node("write_introduction",write_introduction)
builder.add_node("write_conclusion",write_conclusion)
builder.add_node("finalize_report",finalize_report)
# Logic
builder.add_edge(START, "create_analysts")
builder.add_edge("create_analysts", "human_feedback")
builder.add_conditional_edges("human_feedback", initiate_all_interviews, ["create_analysts", "conduct_interview"])
builder.add_edge("conduct_interview", "write_report")
builder.add_edge("conduct_interview", "write_introduction")
builder.add_edge("conduct_interview", "write_conclusion")
builder.add_edge(["write_conclusion", "write_report", "write_introduction"], "finalize_report")
builder.add_edge("finalize_report", END)
# Compile
memory = MemorySaver()
graph = builder.compile(interrupt_before=['human_feedback'], checkpointer=memory)
display(Image(graph.get_graph(xray=1).draw_mermaid_png()))
Now let's run the full research graph end-to-end on our topic.
# Inputs
max_analysts = 3
topic = "Llama3, Llama3.1, Llama3.2 모델들에 대한 분석"
thread = {"configurable": {"thread_id": "1"}}
# Run the graph until the first interruption
for event in graph.stream({"topic":topic,
"max_analysts":max_analysts},
thread,
stream_mode="values"):
analysts = event.get('analysts', '')
if analysts:
for analyst in analysts:
print(f"Name: {analyst.name}")
print(f"Affiliation: {analyst.affiliation}")
print(f"Role: {analyst.role}")
print(f"Description: {analyst.description}")
print("-" * 50)
Name: Dr. Emily Carter
Affiliation: Tech Innovators Inc.
Role: Technology Adoption Specialist
Description: Dr. Carter focuses on the strategic benefits and challenges of adopting new technologies in enterprise environments. She is particularly interested in how LangGraph can streamline operations and improve efficiency.
--------------------------------------------------
Name: Raj Patel
Affiliation: Data Security Solutions
Role: Cybersecurity Analyst
Description: Raj Patel examines the security implications of adopting new frameworks like LangGraph. His primary concern is ensuring that the integration of LangGraph does not introduce vulnerabilities and that it enhances overall system security.
--------------------------------------------------
Name: Dr. Maria Gonzalez
Affiliation: AI Research Lab
Role: AI Ethics Researcher
Description: Dr. Gonzalez explores the ethical considerations of implementing AI frameworks such as LangGraph. She is focused on ensuring that the adoption of LangGraph aligns with ethical standards and promotes fair and unbiased AI practices.
--------------------------------------------------
Name: Dr. Emily Carter
Affiliation: OpenAI
Role: Model Performance Specialist
Description: Dr. Carter focuses on the performance metrics and benchmarks of AI models. She is particularly interested in the comparative analysis of Llama3, Llama3.1, and Llama3.2, examining their efficiency, accuracy, and scalability.
--------------------------------------------------
Name: Prof. John Smith
Affiliation: MIT
Role: Ethics and Bias Analyst
Description: Prof. Smith's research centers on the ethical implications and potential biases in AI models. He will analyze Llama3, Llama3.1, and Llama3.2 for any inherent biases and ethical concerns, providing insights into their societal impacts.
--------------------------------------------------
Name: Dr. Alice Wong
Affiliation: Google Research
Role: Innovation and Development Expert
Description: Dr. Wong is an expert in AI innovation and development. She will explore the technological advancements and novel features introduced in Llama3, Llama3.1, and Llama3.2, highlighting their contributions to the field of AI.
--------------------------------------------------
# We now update the state as if we are the human_feedback node
graph.update_state(thread, {"human_analyst_feedback":
"Add in the CEO of gen ai native startup"}, as_node="human_feedback")
{'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef8083c-f643-6809-8005-779be80ccc9d'}}
# Check
for event in graph.stream(None, thread, stream_mode="values"):
analysts = event.get('analysts', '')
if analysts:
for analyst in analysts:
print(f"Name: {analyst.name}")
print(f"Affiliation: {analyst.affiliation}")
print(f"Role: {analyst.role}")
print(f"Description: {analyst.description}")
print("-" * 50)
Name: Dr. Emily Carter
Affiliation: OpenAI
Role: Model Performance Specialist
Description: Dr. Carter focuses on the performance metrics and benchmarks of AI models. She is particularly interested in the comparative analysis of Llama3, Llama3.1, and Llama3.2, examining their efficiency, accuracy, and scalability.
--------------------------------------------------
Name: Prof. John Smith
Affiliation: MIT
Role: Ethics and Bias Analyst
Description: Prof. Smith's research centers on the ethical implications and potential biases in AI models. He will analyze Llama3, Llama3.1, and Llama3.2 for any inherent biases and ethical concerns, providing insights into their societal impacts.
--------------------------------------------------
Name: Dr. Alice Wong
Affiliation: Google Research
Role: Innovation and Development Expert
Description: Dr. Wong is an expert in AI innovation and development. She will explore the technological advancements and novel features introduced in Llama3, Llama3.1, and Llama3.2, highlighting their contributions to the field of AI.
--------------------------------------------------
Name: Alex Kim
Affiliation: Gen AI Native Startup
Role: CEO
Description: Alex is the CEO of a startup that specializes in generative AI technologies. He is focused on the commercial applications and market potential of the Llama3 series models. His primary concern is how these models can be leveraged to create innovative products and services that can disrupt existing markets.
--------------------------------------------------
Name: Dr. Emily Park
Affiliation: AI Research Institute
Role: Lead Research Scientist
Description: Dr. Park is a lead research scientist at a prominent AI research institute. Her focus is on the technical advancements and innovations in the Llama3 series models. She is particularly interested in the architectural improvements and performance metrics of Llama3, Llama3.1, and Llama3.2.
--------------------------------------------------
Name: John Lee
Affiliation: Tech Media
Role: Tech Journalist
Description: John is a seasoned tech journalist who writes for a leading technology media outlet. His focus is on the broader implications of the Llama3 series models in the tech industry. He is interested in how these models compare to other state-of-the-art AI models and their potential impact on various sectors.
--------------------------------------------------
# Confirm we are happy
graph.update_state(thread, {"human_analyst_feedback":
None}, as_node="human_feedback")
{'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef8083d-d45d-6946-8007-0b1d6cf236ee'}}
# Continue
for event in graph.stream(None, thread, stream_mode="updates"):
print("--Node--")
node_name = next(iter(event.keys()))
print(node_name)
--Node--
conduct_interview
--Node--
conduct_interview
--Node--
conduct_interview
--Node--
write_conclusion
--Node--
write_introduction
--Node--
write_report
--Node--
finalize_report
from IPython.display import Markdown
final_state = graph.get_state(thread)
report = final_state.values.get('final_report')
Markdown(report)
Leveraging Llama 3 for Market Disruption: Insights for Generative AI Applications¶
Introduction¶
The Llama 3 series, developed by Meta, marks a significant leap in generative AI, offering enhanced performance and new capabilities for commercial applications. This report explores how these models can be harnessed to create innovative products and services that disrupt existing markets. We delve into the architectural advancements and performance metrics of Llama 3, Llama 3.1, and Llama 3.2, highlighting their superior language processing abilities. Additionally, we examine the broader implications of these models in the tech industry, including their state-of-the-art capabilities, multilingual support, and optimization for edge devices.
The Llama 3 series, developed by Meta, marks a significant leap in the field of generative AI, offering enhanced performance and new capabilities that can be leveraged for commercial applications. These models, including Llama 3, Llama 3.1, and Llama 3.2, are designed to rival top AI models in various capabilities, such as general knowledge, steerability, math, tool use, and multilingual translation.
Llama 3 is a text-generation AI model similar to OpenAI's GPT and Anthropic's Claude models, generating text responses based on given prompts with notable improvements in contextual understanding and logical reasoning [1]. The series includes models with 8 billion and 70 billion parameters, designed to enhance processing power, versatility, and accessibility [2]. Key improvements in Llama 3 include the use of a tokenizer with a vocabulary of 128K tokens, which encodes language more efficiently, leading to substantially improved model performance. Additionally, the adoption of grouped query attention (GQA) across both the 8B and 70B sizes has improved inference efficiency [3].
One of the most groundbreaking aspects of Llama 3 is its open-source nature, which democratizes access to advanced AI capabilities and fosters innovation. The release of the 405B model, Llama 3.1, further pushes the boundaries by offering state-of-the-art capabilities in general knowledge, steerability, math, tool use, and multilingual translation [4]. This model is poised to supercharge innovation, providing unprecedented opportunities for growth and exploration [5].
Llama 3 introduces several key improvements over its predecessor, Llama 2, starting with its architecture. While maintaining a relatively standard decoder-only transformer architecture, Llama 3 incorporates a tokenizer with a vocabulary of 128K tokens, which encodes language more efficiently and leads to substantially improved model performance [1]. The training data for Llama 3 has also seen a significant increase, with a pretraining corpus expanded by 650%, providing a much richer dataset for the model to learn from [2]. This expansion in training data contributes to Llama 3's enhanced understanding and generation of language.
In real-world applications, Llama 3 has demonstrated notable improvements in the speed and accuracy of language tasks. These enhancements are not just theoretical but have been observed in practical scenarios, showcasing Llama 3's superior performance in handling complex language tasks [3]. However, it is worth noting that despite these advancements, Llama 3's performance in competitive benchmarks has shown variability, with win rates dropping from a high 50% to a low 40% in some cases [4]. Llama 3 also excels in multi-step tasks due to refined post-training processes that minimize false rejections, improve response alignment, and generate more diverse answers, making it more robust and versatile in handling a variety of language tasks compared to Llama 2 [5].
The Llama 3 series models are designed to support a wide range of languages and tasks, including coding, reasoning, and tool usage, making them highly versatile and applicable across various industries, from healthcare to finance [2][3]. The Llama 3.2 models, with 1B and 3B parameters, are optimized for on-device use cases, supporting a context length of 128K tokens, making them ideal for tasks such as summarization, instruction following, and rewriting on mobile and edge devices. They are also optimized for Qualcomm and MediaTek hardware, ensuring broad compatibility and efficient performance [3].
The development of LLMs has seen significant advancements since the introduction of the transformer architecture in 2017. The Llama 3 series builds on this foundation, offering models that are not only powerful but also openly available, which contrasts with the more restricted access of models like GPT-3 and GPT-4 [4]. By fine-tuning the Llama 3 models on data specific to particular industries, it is possible to create custom AI solutions that address unique challenges. For instance, in healthcare, these models can assist in medical image analysis, while in finance, they can be used for predictive analytics and risk assessment [5][6].
Meta has emphasized the importance of safety in the development of the Llama 3.1 models. This focus on safety is crucial as these models are deployed in various sensitive applications, ensuring that they operate within ethical guidelines and minimize potential risks [5]. The Llama 3 series models are poised to have a significant impact on the tech industry, offering advanced capabilities and broad applicability across multiple sectors. Their open availability and optimization for edge devices further enhance their potential to drive innovation and address complex challenges in various domains.
Conclusion¶
The Llama 3 series, developed by Meta, marks a significant leap in generative AI, offering enhanced performance and new capabilities that can disrupt existing markets. This report has explored the potential of Llama 3 for market disruption, highlighting its open-source nature and advanced features. We delved into the architectural improvements and performance metrics of Llama3, Llama3.1, and Llama3.2, noting substantial advancements over Llama2. Additionally, we examined the broader implications of these models in the tech industry, emphasizing their state-of-the-art capabilities, multilingual support, and optimization for edge devices. The Llama 3 series stands poised to drive innovation and address complex challenges across various sectors.
Sources¶
[1] https://www.datacamp.com/blog/meta-announces-llama-3-the-next-generation-of-open-source-llms
[2] https://techcrunch.com/2024/04/18/meta-releases-llama-3-claims-its-among-the-best-open-models-available/
[3] https://ai.meta.com/blog/meta-llama-3/
[4] https://builtin.com/articles/llama-3
[5] https://ai.meta.com/blog/meta-llama-3-1/
[6] https://www.unite.ai/everything-you-need-to-know-about-llama-3-most-powerful-open-source-model-yet-concepts-to-usage/
[7] https://kanerika.com/blogs/llama-3-vs-llama-2/
[8] https://www.linkedin.com/pulse/comprehensive-technical-analysis-llama-3-comparison-2-ibad-rehman-kw8pe
[9] https://lmsys.org/blog/2024-05-08-llama3/
[10] https://blog.monsterapi.ai/blogs/what-is-llama-3-and-how-it-differs-from-llama-2/
[11] https://arxiv.org/abs/2407.21783
[12] https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/
[13] https://en.wikipedia.org/wiki/Large_language_model
[14] https://www.datacamp.com/blog/llama-3-1-405b-meta-ai
[15] https://www.indikaai.com/blog/the-llama-3-herd-of-models-a-revolution-in-multilingual-ai-and-beyond
We can see that the three interviews (conduct_interview) and the report-writing nodes each ran as expected.
Is the resulting report actually good? That is debatable.
- Solid information about Llama3 and Llama3.1 is well covered.
- It did not consult the original "Herd of Models" paper, but it did draw on blog posts that reinterpret it.
- Other key points, such as GQA and the 405B model, are captured well.
- The problem is the coverage of Llama3.2.
- The report completely misses that very small models were added and that cross-attention was used to incorporate vision input.
Analyzing the cause: as of the day I ran this code (2024-10-02), Llama3.2 had been out for only seven days. Popular articles reinterpreting the model presumably did not yet exist on the web, so the searches could not surface them. This is something we need to fix.
Trying It in LangGraph Studio¶
If you turn the graph code above into a .py file and add it to a Studio project, you can run it with a UI as shown below.
For the .py code, see https://github.com/langchain-ai/langchain-academy/tree/main/module-4/studio.
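A Studio project is essentially the graph module plus a langgraph.json config file. As a rough sketch of what that config typically looks like (the file and graph names here are assumptions, not the exact contents of the linked repo):
{
  "dependencies": ["."],
  "graphs": {
    "research_assistant": "./research_assistant.py:graph"
  },
  "env": ".env"
}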
Since the Llama3.2 coverage was lacking, I had the graph run the research again.
This time it did produce results about Llama3.2, and it referenced Meta's official Llama3.2 material well.
However, it failed to properly distinguish Llama 3 from Llama 3.2, so the content is muddled and contains many incorrect statements. Not good enough.
It seems we need to add human-in-the-loop at the search step so that irrelevant sources can be pruned out (see the sketch below).
A link to the result is here.
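Sketched below is one possible direction, reusing interview_builder from above: compile the interview sub-graph with an interrupt right after the search nodes so a human can inspect what was retrieved before the expert answers. The thread id is arbitrary, and this is a sketch rather than a finished fix.
# Pause the interview sub-graph after retrieval so a human can review the sources
review_memory = MemorySaver()
review_graph = interview_builder.compile(
    checkpointer=review_memory,
    interrupt_after=["search_web", "search_wikipedia"],
)

review_thread = {"configurable": {"thread_id": "review-1"}}
review_graph.invoke(
    {"analyst": analysts[0],
     "messages": [HumanMessage(content=f"So you said you were writing an article on {topic}?")],
     "max_num_turns": 2},
    review_thread,
)

# Inspect what was retrieved at the interrupt
retrieved = review_graph.get_state(review_thread).values["context"]
print(len(retrieved), "retrieved document batches")
# Note: `context` uses an operator.add reducer, so update_state() would append rather than
# replace; actual pruning would need a dedicated filtering node or a different reducer.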
Conclusion
We were able to build a decent research assistant, but it is not perfect. We will keep looking for ways to address its shortcomings.
The next topic is Reflection.