Research Assistant¶
Review¶
We have learned the following core LangGraph concepts:
- Memory
- Human-in-the-loop
- Controllability
Now let's combine all three to build one of the most popular AI use cases: research automation.
Research is labor-intensive work for an analyst, so let's see how AI can help with it.
There are problems to solve before research can be automated well. LLM output is often poorly aligned with our real-world needs: it states the obvious, or it fails to find and cite its sources accurately.
The research and report generation docs describe a fairly good example workflow. We will implement that methodology with LangGraph.
Goal¶
The goal is as follows:
Build a lightweight multi-agent system that carries out research work.
Source Selection
- The user selects the input sources.
Planning
- The user provides a topic, and the system creates a "team" of AI analysts; each analyst takes on a sub-topic to analyze.
- In a Human-in-the-loop step, a person will refine the sub-topics; splitting up the work is something a human still does best.
LLM Utilization
- Each analyst conducts an in-depth interview with another expert AI, drawing on the selected sources.
- The interview is a multi-turn conversation, which is how we extract detailed insights; the STORM paper is a helpful reference.
- Each multi-turn interview will be a sub-graph, since every interview needs to keep its own state.
Research Process
- The experts gather information to answer the analysts' questions, and they do so in parallel.
- All interviews run concurrently and their results are merged afterwards: classic map-reduce (see the sketch after this list).
Output Format
- The insights collected from each interview are distilled into a final report.
- We use prompting to enforce the desired output format.
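Before we build the real thing, here is a minimal, self-contained sketch of that map-reduce pattern using LangGraph's Send() API (runnable once the packages below are installed). The node and key names (research, topics, notes) are illustrative placeholders, not part of the graph we build later.
import operator
from typing import Annotated, List
from typing_extensions import TypedDict
from langgraph.constants import Send
from langgraph.graph import StateGraph, START, END

class OverallState(TypedDict):
    topics: List[str]                     # items to fan out over ("map")
    notes: Annotated[list, operator.add]  # parallel results are appended here ("reduce")

class WorkerState(TypedDict):
    topic: str

def fan_out(state: OverallState):
    # One Send() per item: each runs the "research" node in parallel with its own state
    return [Send("research", {"topic": t}) for t in state["topics"]]

def research(state: WorkerState):
    # Stand-in for the real work (an interview sub-graph in this lesson)
    return {"notes": [f"notes on {state['topic']}"]}

map_reduce_builder = StateGraph(OverallState)
map_reduce_builder.add_node("research", research)
map_reduce_builder.add_conditional_edges(START, fan_out, ["research"])
map_reduce_builder.add_edge("research", END)
print(map_reduce_builder.compile().invoke({"topics": ["Llama3", "Llama3.1", "Llama3.2"]}))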
%%capture --no-stderr
%pip install --quiet -U langgraph langchain_openai langchain_community langchain_core tavily-python wikipedia
Setup¶
import os, getpass
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("OPENAI_API_KEY")
OPENAI_API_KEY: ··········
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o", temperature=0)
_set_env("LANGCHAIN_API_KEY")
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "langchain-academy"
LANGCHAIN_API_KEY: ··········
Generate Analysts: Human-In-The-Loop¶
We generate the analysts, then pause via human-in-the-loop.
This lets us review the generated analysts directly before continuing.
from typing import List
from typing_extensions import TypedDict
from pydantic import BaseModel, Field
class Analyst(BaseModel):
affiliation: str = Field(
description="Primary affiliation of the analyst.",
)
name: str = Field(
description="Name of the analyst."
)
role: str = Field(
description="Role of the analyst in the context of the topic.",
)
description: str = Field(
description="Description of the analyst focus, concerns, and motives.",
)
@property
def persona(self) -> str:
return f"Name: {self.name}\nRole: {self.role}\nAffiliation: {self.affiliation}\nDescription: {self.description}\n"
class Perspectives(BaseModel):
analysts: List[Analyst] = Field(
description="Comprehensive list of analysts with their roles and affiliations.",
)
class GenerateAnalystsState(TypedDict):
topic: str # Research topic
max_analysts: int # Number of analysts
human_analyst_feedback: str # Human feedback
analysts: List[Analyst] # Analyst asking questions
from IPython.display import Image, display
from langgraph.graph import START, END, StateGraph
from langgraph.checkpoint.memory import MemorySaver
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
analyst_instructions="""You are tasked with creating a set of AI analyst personas. Follow these instructions carefully:
1. First, review the research topic:
{topic}
2. Examine any editorial feedback that has been optionally provided to guide creation of the analysts:
{human_analyst_feedback}
3. Determine the most interesting themes based upon documents and / or feedback above.
4. Pick the top {max_analysts} themes.
5. Assign one analyst to each theme."""
def create_analysts(state: GenerateAnalystsState):
""" Create analysts """
topic=state['topic']
max_analysts=state['max_analysts']
human_analyst_feedback=state.get('human_analyst_feedback', '')
# Enforce structured output
structured_llm = llm.with_structured_output(Perspectives)
# System message
system_message = analyst_instructions.format(topic=topic,
human_analyst_feedback=human_analyst_feedback,
max_analysts=max_analysts)
    # Generate analysts
analysts = structured_llm.invoke([SystemMessage(content=system_message)]+[HumanMessage(content="Generate the set of analysts.")])
    # Write the list of analysts to state
return {"analysts": analysts.analysts}
def human_feedback(state: GenerateAnalystsState):
""" No-op node that should be interrupted on """
pass
def should_continue(state: GenerateAnalystsState):
""" Return the next node to execute """
# Check if human feedback
human_analyst_feedback=state.get('human_analyst_feedback', None)
if human_analyst_feedback:
return "create_analysts"
# Otherwise end
return END
# Add nodes and edges
builder = StateGraph(GenerateAnalystsState)
builder.add_node("create_analysts", create_analysts)
builder.add_node("human_feedback", human_feedback)
builder.add_edge(START, "create_analysts")
builder.add_edge("create_analysts", "human_feedback")
builder.add_conditional_edges("human_feedback", should_continue, ["create_analysts", END])
# Compile
memory = MemorySaver()
graph = builder.compile(interrupt_before=['human_feedback'], checkpointer=memory)
# View
display(Image(graph.get_graph(xray=1).draw_mermaid_png()))
# Input
max_analysts = 3
topic = "Llama3, Llama3.1, Llama3.2 모델들에 대한 분석"
thread = {"configurable": {"thread_id": "1"}}
# Run the graph until the first interruption
for event in graph.stream({"topic":topic,"max_analysts":max_analysts,}, thread, stream_mode="values"):
# Review
analysts = event.get('analysts', '')
if analysts:
for analyst in analysts:
print(f"Name: {analyst.name}")
print(f"Affiliation: {analyst.affiliation}")
print(f"Role: {analyst.role}")
print(f"Description: {analyst.description}")
print("-" * 50)
Name: Dr. Emily Carter
Affiliation: OpenAI
Role: Model Performance Specialist
Description: Dr. Carter focuses on the performance metrics and efficiency of AI models. She is particularly interested in the computational efficiency and accuracy of Llama3 series models.
--------------------------------------------------
Name: Prof. Michael Zhang
Affiliation: Stanford University
Role: Ethics and Bias Researcher
Description: Prof. Zhang examines the ethical implications and potential biases in AI models. His work with the Llama3 series involves identifying and mitigating biases in the models' outputs.
--------------------------------------------------
Name: Dr. Sarah Lee
Affiliation: Google DeepMind
Role: Innovation and Applications Expert
Description: Dr. Lee explores innovative applications and real-world use cases of AI models. She is interested in how the Llama3 series can be applied in various industries and the potential for new technological advancements.
--------------------------------------------------
# Get state and look at next node
state = graph.get_state(thread)
state.next
('human_feedback',)
# We now update the state as if we are the human_feedback node
graph.update_state(thread, {"human_analyst_feedback":
    "Add someone who can research how startups could make use of the Llama models"}, as_node="human_feedback")
{'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef80803-4512-6d04-8002-c5b0b3792ba9'}}
# Continue the graph execution
for event in graph.stream(None, thread, stream_mode="values"):
# Review
analysts = event.get('analysts', '')
if analysts:
for analyst in analysts:
print(f"Name: {analyst.name}")
print(f"Affiliation: {analyst.affiliation}")
print(f"Role: {analyst.role}")
print(f"Description: {analyst.description}")
print("-" * 50)
Name: Dr. Emily Carter
Affiliation: OpenAI
Role: Model Performance Specialist
Description: Dr. Carter focuses on the performance metrics and efficiency of AI models. She is particularly interested in the computational efficiency and accuracy of Llama3 series models.
--------------------------------------------------
Name: Prof. Michael Zhang
Affiliation: Stanford University
Role: Ethics and Bias Researcher
Description: Prof. Zhang examines the ethical implications and potential biases in AI models. His work with the Llama3 series involves identifying and mitigating biases in the models' outputs.
--------------------------------------------------
Name: Dr. Sarah Lee
Affiliation: Google DeepMind
Role: Innovation and Applications Expert
Description: Dr. Lee explores innovative applications and real-world use cases of AI models. She is interested in how the Llama3 series can be applied in various industries and the potential for new technological advancements.
--------------------------------------------------
Name: Dr. Min-Jae Kim
Affiliation: AI Research Lab
Role: Lead AI Researcher
Description: Dr. Kim focuses on the technical advancements and performance metrics of the Llama3 series models. His primary concern is to evaluate the models' capabilities, limitations, and potential improvements.
--------------------------------------------------
Name: Soo-Jin Park
Affiliation: Tech Startup Incubator
Role: Startup Strategy Consultant
Description: Soo-Jin specializes in advising startups on integrating advanced AI models like Llama3 into their business strategies. She explores practical applications, cost-benefit analyses, and market positioning.
--------------------------------------------------
Name: Professor Hyeon-Woo Lee
Affiliation: University of Seoul
Role: AI Ethics and Policy Expert
Description: Professor Lee examines the ethical implications and policy considerations of deploying Llama3 models. His focus includes data privacy, algorithmic bias, and regulatory compliance.
--------------------------------------------------
# If we are satisfied, then we simply supply no feedback
further_feedback = None
graph.update_state(thread, {"human_analyst_feedback":
    further_feedback}, as_node="human_feedback")
{'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef80804-889b-6751-8004-bb0dd29aadcc'}}
# Continue the graph execution to end
for event in graph.stream(None, thread, stream_mode="updates"):
print("--Node--")
node_name = next(iter(event.keys()))
print(node_name)
final_state = graph.get_state(thread)
analysts = final_state.values.get('analysts')
final_state.next
()
for analyst in analysts:
print(f"Name: {analyst.name}")
print(f"Affiliation: {analyst.affiliation}")
print(f"Role: {analyst.role}")
print(f"Description: {analyst.description}")
print("-" * 50)
Name: Dr. Min-Jae Kim
Affiliation: AI Research Lab
Role: Lead AI Researcher
Description: Dr. Kim focuses on the technical advancements and performance metrics of the Llama3 series models. His primary concern is to evaluate the models' capabilities, limitations, and potential improvements.
--------------------------------------------------
Name: Soo-Jin Park
Affiliation: Tech Startup Incubator
Role: Startup Strategy Consultant
Description: Soo-Jin specializes in advising startups on integrating advanced AI models like Llama3 into their business strategies. She explores practical applications, cost-benefit analyses, and market positioning.
--------------------------------------------------
Name: Professor Hyeon-Woo Lee
Affiliation: University of Seoul
Role: AI Ethics and Policy Expert
Description: Professor Lee examines the ethical implications and policy considerations of deploying Llama3 models. His focus includes data privacy, algorithmic bias, and regulatory compliance.
--------------------------------------------------
import operator
from typing import Annotated
from langgraph.graph import MessagesState
class InterviewState(MessagesState):
    max_num_turns: int # Number of conversation turns
context: Annotated[list, operator.add] # Source docs
analyst: Analyst # Analyst asking questions
interview: str # Interview transcript
sections: list # Final key we duplicate in outer state for Send() API
class SearchQuery(BaseModel):
search_query: str = Field(None, description="Search query for retrieval.")
question_instructions = """You are an analyst tasked with interviewing an expert to learn about a specific topic.
Your goal is to boil the conversation down to interesting and specific insights related to your topic.
1. Interesting: Insights that people will find surprising or non-obvious.
2. Specific: Insights that avoid generalities and include specific examples from the expert.
Here is your topic of focus and set of goals: {goals}
Begin by introducing yourself using a name that fits your persona, and then ask your question.
Continue to ask questions to drill down and refine your understanding of the topic.
When you are satisfied with your understanding, complete the interview with: "Thank you so much for your help!"
Remember to stay in character throughout your response, reflecting the persona and goals provided to you."""
def generate_question(state: InterviewState):
""" Node to generate a question """
# Get state
analyst = state["analyst"]
messages = state["messages"]
# Generate question
system_message = question_instructions.format(goals=analyst.persona)
question = llm.invoke([SystemMessage(content=system_message)]+messages)
# Write messages to state
return {"messages": [question]}
Generating Answers: In Parallel!¶
The experts gather the information for their answers from a variety of sources:
- Web crawling (parsing specific pages rather than searching), e.g. WebBaseLoader (see the sketch below)
- RAG over pre-curated documents
- Web search
- Wikipedia search
Other web-search tools, such as Tavily, can also be used.
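As a hedged sketch of the crawling option, a WebBaseLoader-based node could look roughly like the following; the URL list is just an example, and the node mirrors the search_web node defined below by appending formatted text to the shared context key.
from langchain_community.document_loaders import WebBaseLoader

def load_fixed_pages(state: InterviewState):
    """ Retrieve docs by parsing specific pages (no search) -- illustrative only """
    urls = ["https://ai.meta.com/blog/meta-llama-3-1/"]  # example source list
    docs = WebBaseLoader(urls).load()
    # Format each page the same way as the search nodes below
    formatted_docs = "\n\n---\n\n".join(
        f'<Document href="{d.metadata.get("source", "")}"/>\n{d.page_content}\n</Document>'
        for d in docs
    )
    return {"context": [formatted_docs]}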
def _set_env(var: str):
if not os.environ.get(var):
os.environ[var] = getpass.getpass(f"{var}: ")
_set_env("TAVILY_API_KEY")
TAVILY_API_KEY: ··········
# Web search tool
from langchain_community.tools.tavily_search import TavilySearchResults
tavily_search = TavilySearchResults(max_results=3)
# Wikipedia search tool
from langchain_community.document_loaders import WikipediaLoader
Let's create the nodes that search Wikipedia and the web.
We will also create a node that generates the expert's answer for the analyst,
and a node that records the full interview and distills it into a report section.
from langchain_core.messages import get_buffer_string
# Search query writing
search_instructions = SystemMessage(content=f"""You will be given a conversation between an analyst and an expert.
Your goal is to generate a well-structured query for use in retrieval and / or web-search related to the conversation.
First, analyze the full conversation.
Pay particular attention to the final question posed by the analyst.
Convert this final question into a well-structured web search query""")
def search_web(state: InterviewState):
""" Retrieve docs from web search """
# Search query
structured_llm = llm.with_structured_output(SearchQuery)
search_query = structured_llm.invoke([search_instructions]+state['messages'])
# Search
search_docs = tavily_search.invoke(search_query.search_query)
# Format
formatted_search_docs = "\n\n---\n\n".join(
[
f'<Document href="{doc["url"]}"/>\n{doc["content"]}\n</Document>'
for doc in search_docs
]
)
return {"context": [formatted_search_docs]}
def search_wikipedia(state: InterviewState):
""" Retrieve docs from wikipedia """
# Search query
structured_llm = llm.with_structured_output(SearchQuery)
search_query = structured_llm.invoke([search_instructions]+state['messages'])
# Search
search_docs = WikipediaLoader(query=search_query.search_query,
load_max_docs=2).load()
# Format
formatted_search_docs = "\n\n---\n\n".join(
[
f'<Document source="{doc.metadata["source"]}" page="{doc.metadata.get("page", "")}"/>\n{doc.page_content}\n</Document>'
for doc in search_docs
]
)
return {"context": [formatted_search_docs]}
answer_instructions = """You are an expert being interviewed by an analyst.
Here is the analyst's area of focus: {goals}.
Your goal is to answer a question posed by the interviewer.
To answer the question, use this context:
{context}
When answering questions, follow these guidelines:
1. Use only the information provided in the context.
2. Do not introduce external information or make assumptions beyond what is explicitly stated in the context.
3. The context contains sources at the top of each individual document.
4. Include these sources in your answer next to any relevant statements. For example, for source # 1 use [1].
5. List your sources in order at the bottom of your answer. [1] Source 1, [2] Source 2, etc
6. If the source is: <Document source="assistant/docs/llama3_1.pdf" page="7"/>' then just list:
[1] assistant/docs/llama3_1.pdf, page 7
And skip the addition of the brackets as well as the Document source preamble in your citation."""
def generate_answer(state: InterviewState):
""" Node to answer a question """
# Get state
analyst = state["analyst"]
messages = state["messages"]
context = state["context"]
# Answer question
system_message = answer_instructions.format(goals=analyst.persona, context=context)
answer = llm.invoke([SystemMessage(content=system_message)]+messages)
# Name the message as coming from the expert
answer.name = "expert"
# Append it to state
return {"messages": [answer]}
def save_interview(state: InterviewState):
""" Save interviews """
# Get messages
messages = state["messages"]
# Convert interview to a string
interview = get_buffer_string(messages)
# Save to interviews key
return {"interview": interview}
def route_messages(state: InterviewState,
name: str = "expert"):
""" Route between question and answer """
# Get messages
messages = state["messages"]
max_num_turns = state.get('max_num_turns',2)
# Check the number of expert answers
num_responses = len(
[m for m in messages if isinstance(m, AIMessage) and m.name == name]
)
# End if expert has answered more than the max turns
if num_responses >= max_num_turns:
return 'save_interview'
# This router is run after each question - answer pair
# Get the last question asked to check if it signals the end of discussion
last_question = messages[-2]
if "Thank you so much for your help" in last_question.content:
return 'save_interview'
return "ask_question"
section_writer_instructions = """You are an expert technical writer.
Your task is to create a short, easily digestible section of a report based on a set of source documents.
1. Analyze the content of the source documents:
- The name of each source document is at the start of the document, with the <Document tag.
2. Create a report structure using markdown formatting:
- Use ## for the section title
- Use ### for sub-section headers
3. Write the report following this structure:
a. Title (## header)
b. Summary (### header)
c. Sources (### header)
4. Make your title engaging based upon the focus area of the analyst:
{focus}
5. For the summary section:
- Set up summary with general background / context related to the focus area of the analyst
- Emphasize what is novel, interesting, or surprising about insights gathered from the interview
- Create a numbered list of source documents, as you use them
- Do not mention the names of interviewers or experts
- Aim for approximately 400 words maximum
- Use numbered sources in your report (e.g., [1], [2]) based on information from source documents
6. In the Sources section:
- Include all sources used in your report
- Provide full links to relevant websites or specific document paths
- Separate each source by a newline. Use two spaces at the end of each line to create a newline in Markdown.
- It will look like:
### Sources
[1] Link or Document name
[2] Link or Document name
7. Be sure to combine sources. For example this is not correct:
[3] https://ai.meta.com/blog/meta-llama-3-1/
[4] https://ai.meta.com/blog/meta-llama-3-1/
There should be no redundant sources. It should simply be:
[3] https://ai.meta.com/blog/meta-llama-3-1/
8. Final review:
- Ensure the report follows the required structure
- Include no preamble before the title of the report
- Check that all guidelines have been followed"""
def write_section(state: InterviewState):
    """ Node to write a report section """
# Get state
interview = state["interview"]
context = state["context"]
analyst = state["analyst"]
# Write section using either the gathered source docs from interview (context) or the interview itself (interview)
system_message = section_writer_instructions.format(focus=analyst.description)
section = llm.invoke([SystemMessage(content=system_message)]+[HumanMessage(content=f"Use this source to write your section: {context}")])
# Append it to state
return {"sections": [section.content]}
# Add nodes and edges
interview_builder = StateGraph(InterviewState)
interview_builder.add_node("ask_question", generate_question)
interview_builder.add_node("search_web", search_web)
interview_builder.add_node("search_wikipedia", search_wikipedia)
interview_builder.add_node("answer_question", generate_answer)
interview_builder.add_node("save_interview", save_interview)
interview_builder.add_node("write_section", write_section)
# Flow
interview_builder.add_edge(START, "ask_question")
interview_builder.add_edge("ask_question", "search_web")
interview_builder.add_edge("ask_question", "search_wikipedia")
interview_builder.add_edge("search_web", "answer_question")
interview_builder.add_edge("search_wikipedia", "answer_question")
interview_builder.add_conditional_edges("answer_question", route_messages,['ask_question','save_interview'])
interview_builder.add_edge("save_interview", "write_section")
interview_builder.add_edge("write_section", END)
# Interview
memory = MemorySaver()
interview_graph = interview_builder.compile(checkpointer=memory).with_config(run_name="Conduct Interviews")
# View
display(Image(interview_graph.get_graph().draw_mermaid_png()))
# Pick one analyst
analysts[0]
Analyst(affiliation='AI Research Lab', name='Dr. Min-Jae Kim', role='Lead AI Researcher', description="Dr. Kim focuses on the technical advancements and performance metrics of the Llama3 series models. His primary concern is to evaluate the models' capabilities, limitations, and potential improvements.")
from IPython.display import Markdown
messages = [HumanMessage(f"So you said you were writing an article on {topic}?")]
thread = {"configurable": {"thread_id": "1"}}
interview = interview_graph.invoke({"analyst": analysts[0], "messages": messages, "max_num_turns": 2}, thread)
Markdown(interview['sections'][0])
Evaluating the Technical Advancements and Performance Metrics of Llama3 Series Models¶
Summary¶
The Llama3 series, developed by Meta AI, represents a significant advancement in the field of large language models (LLMs). Positioned as a competitor to OpenAI's GPT series, Llama3 models are designed to excel in various natural language processing (NLP) tasks such as text generation, conversation, and summarization [1]. This report delves into the capabilities, limitations, and potential improvements of the Llama3 series, with a particular focus on the Llama3.1 model.
Llama3.1, the latest iteration, boasts an impressive 405 billion parameters, marking a substantial leap from its predecessors [2]. This model is designed to be versatile and powerful, capable of handling a wide range of tasks with improved performance metrics. Notably, Llama3.1 outperforms Llama3 in several benchmarks, including math tasks where it shows a 14% improvement [3]. This enhancement is attributed to its larger context window and better NLP capabilities, making it a more suitable choice for tasks requiring extensive context [4].
One of the most surprising insights is the performance of the smaller Llama3 8B model, which, despite being ten times smaller than the Llama2 70B, produces similar results [2]. This indicates that Meta AI has made significant strides in optimizing model efficiency without compromising performance.
Key improvements in Llama3.1 include faster performance, broader versatility, and enhanced ease of use. These advancements make it a worthy upgrade for those looking to leverage the power of AI in their projects [5]. However, it's important to note that while Llama3.1 offers numerous benefits, Llama3 still holds its own, particularly in specific use cases where its capabilities are sufficient [5].
In summary, the Llama3 series, particularly the Llama3.1 model, showcases Meta AI's commitment to advancing the field of artificial intelligence. The models' capabilities, combined with their performance metrics, make them strong contenders in the competitive landscape of LLMs.
Sources¶
[1] https://kili-technology.com/large-language-models-llms/llama-3-guide-everything-you-need-to-know-about-meta-s-new-model-and-its-data
[2] https://medium.com/@soumava.dey.aig/decoding-llama-3-a-quick-overview-of-the-model-7e69abcdbe6a
[3] https://medium.com/@getanakin/llama3-vs-llama-comprehensive-review-of-benchmarks-and-pricing-1767fd1bd04a
[4] https://medium.com/@kagglepro/llama-3-vs-llama-3-1-which-is-the-better-fit-for-your-ai-projects-a57a052b89b1
[5] http://anakin.ai/blog/llama3-1-vs-llama-3/
import operator
from typing import List, Annotated
from typing_extensions import TypedDict
class ResearchGraphState(TypedDict):
topic: str # Research topic
max_analysts: int # Number of analysts
human_analyst_feedback: str # Human feedback
analysts: List[Analyst] # Analyst asking questions
sections: Annotated[list, operator.add] # Send() API key
introduction: str # Introduction for the final report
content: str # Content for the final report
conclusion: str # Conclusion for the final report
final_report: str # Final report
from langgraph.constants import Send
def initiate_all_interviews(state: ResearchGraphState):
""" This is the "map" step where we run each interview sub-graph using Send API """
# Check if human feedback
human_analyst_feedback=state.get('human_analyst_feedback')
if human_analyst_feedback:
# Return to create_analysts
return "create_analysts"
# Otherwise kick off interviews in parallel via Send() API
else:
topic = state["topic"]
return [Send("conduct_interview", {"analyst": analyst,
"messages": [HumanMessage(
content=f"So you said you were writing an article on {topic}?"
)
]}) for analyst in state["analysts"]]
report_writer_instructions = """You are a technical writer creating a report on this overall topic:
{topic}
You have a team of analysts. Each analyst has done two things:
1. They conducted an interview with an expert on a specific sub-topic.
2. They wrote up their findings into a memo.
Your task:
1. You will be given a collection of memos from your analysts.
2. Think carefully about the insights from each memo.
3. Consolidate these into a crisp overall summary that ties together the central ideas from all of the memos.
4. Summarize the central points in each memo into a cohesive single narrative.
To format your report:
1. Use markdown formatting.
2. Include no pre-amble for the report.
3. Use no sub-heading.
4. Start your report with a single title header: ## Insights
5. Do not mention any analyst names in your report.
6. Preserve any citations in the memos, which will be annotated in brackets, for example [1] or [2].
7. Create a final, consolidated list of sources and add to a Sources section with the `## Sources` header.
8. List your sources in order and do not repeat.
[1] Source 1
[2] Source 2
Here are the memos from your analysts to build your report from:
{context}"""
def write_report(state: ResearchGraphState):
# Full set of sections
sections = state["sections"]
topic = state["topic"]
# Concat all sections together
formatted_str_sections = "\n\n".join([f"{section}" for section in sections])
# Summarize the sections into a final report
system_message = report_writer_instructions.format(topic=topic, context=formatted_str_sections)
report = llm.invoke([SystemMessage(content=system_message)]+[HumanMessage(content=f"Write a report based upon these memos.")])
return {"content": report.content}
intro_conclusion_instructions = """You are a technical writer finishing a report on {topic}
You will be given all of the sections of the report.
Your job is to write a crisp and compelling introduction or conclusion section.
The user will instruct you whether to write the introduction or conclusion.
Include no pre-amble for either section.
Target around 100 words, crisply previewing (for introduction) or recapping (for conclusion) all of the sections of the report.
Use markdown formatting.
For your introduction, create a compelling title and use the # header for the title.
For your introduction, use ## Introduction as the section header.
For your conclusion, use ## Conclusion as the section header.
Here are the sections to reflect on for writing: {formatted_str_sections}"""
def write_introduction(state: ResearchGraphState):
# Full set of sections
sections = state["sections"]
topic = state["topic"]
# Concat all sections together
formatted_str_sections = "\n\n".join([f"{section}" for section in sections])
# Summarize the sections into a final report
instructions = intro_conclusion_instructions.format(topic=topic, formatted_str_sections=formatted_str_sections)
intro = llm.invoke([instructions]+[HumanMessage(content=f"Write the report introduction")])
return {"introduction": intro.content}
def write_conclusion(state: ResearchGraphState):
# Full set of sections
sections = state["sections"]
topic = state["topic"]
# Concat all sections together
formatted_str_sections = "\n\n".join([f"{section}" for section in sections])
# Summarize the sections into a final report
instructions = intro_conclusion_instructions.format(topic=topic, formatted_str_sections=formatted_str_sections)
conclusion = llm.invoke([instructions]+[HumanMessage(content=f"Write the report conclusion")])
return {"conclusion": conclusion.content}
def finalize_report(state: ResearchGraphState):
    """ This is the "reduce" step where we gather all the sections, combine them, and reflect on them to write the intro/conclusion """
# Save full final report
content = state["content"]
    if content.startswith("## Insights"):
        content = content.removeprefix("## Insights")
if "## Sources" in content:
try:
content, sources = content.split("\n## Sources\n")
except:
sources = None
else:
sources = None
final_report = state["introduction"] + "\n\n---\n\n" + content + "\n\n---\n\n" + state["conclusion"]
if sources is not None:
final_report += "\n\n## Sources\n" + sources
return {"final_report": final_report}
# Add nodes and edges
builder = StateGraph(ResearchGraphState)
builder.add_node("create_analysts", create_analysts)
builder.add_node("human_feedback", human_feedback)
builder.add_node("conduct_interview", interview_builder.compile())
builder.add_node("write_report",write_report)
builder.add_node("write_introduction",write_introduction)
builder.add_node("write_conclusion",write_conclusion)
builder.add_node("finalize_report",finalize_report)
# Logic
builder.add_edge(START, "create_analysts")
builder.add_edge("create_analysts", "human_feedback")
builder.add_conditional_edges("human_feedback", initiate_all_interviews, ["create_analysts", "conduct_interview"])
builder.add_edge("conduct_interview", "write_report")
builder.add_edge("conduct_interview", "write_introduction")
builder.add_edge("conduct_interview", "write_conclusion")
builder.add_edge(["write_conclusion", "write_report", "write_introduction"], "finalize_report")
builder.add_edge("finalize_report", END)
# Compile
memory = MemorySaver()
graph = builder.compile(interrupt_before=['human_feedback'], checkpointer=memory)
display(Image(graph.get_graph(xray=1).draw_mermaid_png()))
Now let's run the full research graph end-to-end on our topic.
# Inputs
max_analysts = 3
topic = "Llama3, Llama3.1, Llama3.2 모델들에 대한 분석"
thread = {"configurable": {"thread_id": "1"}}
# Run the graph until the first interruption
for event in graph.stream({"topic":topic,
"max_analysts":max_analysts},
thread,
stream_mode="values"):
analysts = event.get('analysts', '')
if analysts:
for analyst in analysts:
print(f"Name: {analyst.name}")
print(f"Affiliation: {analyst.affiliation}")
print(f"Role: {analyst.role}")
print(f"Description: {analyst.description}")
print("-" * 50)
Name: Dr. Emily Carter
Affiliation: Tech Innovators Inc.
Role: Technology Adoption Specialist
Description: Dr. Carter focuses on the strategic benefits and challenges of adopting new technologies in enterprise environments. She is particularly interested in how LangGraph can streamline operations and improve efficiency.
--------------------------------------------------
Name: Raj Patel
Affiliation: Data Security Solutions
Role: Cybersecurity Analyst
Description: Raj Patel examines the security implications of adopting new frameworks like LangGraph. His primary concern is ensuring that the integration of LangGraph does not introduce vulnerabilities and that it enhances overall system security.
--------------------------------------------------
Name: Dr. Maria Gonzalez
Affiliation: AI Research Lab
Role: AI Ethics Researcher
Description: Dr. Gonzalez explores the ethical considerations of implementing AI frameworks such as LangGraph. She is focused on ensuring that the adoption of LangGraph aligns with ethical standards and promotes fair and unbiased AI practices.
--------------------------------------------------
Name: Dr. Emily Carter
Affiliation: OpenAI
Role: Model Performance Specialist
Description: Dr. Carter focuses on the performance metrics and benchmarks of AI models. She is particularly interested in the comparative analysis of Llama3, Llama3.1, and Llama3.2, examining their efficiency, accuracy, and scalability.
--------------------------------------------------
Name: Prof. John Smith
Affiliation: MIT
Role: Ethics and Bias Analyst
Description: Prof. Smith's research centers on the ethical implications and potential biases in AI models. He will analyze Llama3, Llama3.1, and Llama3.2 for any inherent biases and ethical concerns, providing insights into their societal impacts.
--------------------------------------------------
Name: Dr. Alice Wong
Affiliation: Google Research
Role: Innovation and Development Expert
Description: Dr. Wong is an expert in AI innovation and development. She will explore the technological advancements and novel features introduced in Llama3, Llama3.1, and Llama3.2, highlighting their contributions to the field of AI.
--------------------------------------------------
# We now update the state as if we are the human_feedback node
graph.update_state(thread, {"human_analyst_feedback":
"Add in the CEO of gen ai native startup"}, as_node="human_feedback")
{'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef8083c-f643-6809-8005-779be80ccc9d'}}
# Check
for event in graph.stream(None, thread, stream_mode="values"):
analysts = event.get('analysts', '')
if analysts:
for analyst in analysts:
print(f"Name: {analyst.name}")
print(f"Affiliation: {analyst.affiliation}")
print(f"Role: {analyst.role}")
print(f"Description: {analyst.description}")
print("-" * 50)
Name: Dr. Emily Carter
Affiliation: OpenAI
Role: Model Performance Specialist
Description: Dr. Carter focuses on the performance metrics and benchmarks of AI models. She is particularly interested in the comparative analysis of Llama3, Llama3.1, and Llama3.2, examining their efficiency, accuracy, and scalability.
--------------------------------------------------
Name: Prof. John Smith
Affiliation: MIT
Role: Ethics and Bias Analyst
Description: Prof. Smith's research centers on the ethical implications and potential biases in AI models. He will analyze Llama3, Llama3.1, and Llama3.2 for any inherent biases and ethical concerns, providing insights into their societal impacts.
--------------------------------------------------
Name: Dr. Alice Wong
Affiliation: Google Research
Role: Innovation and Development Expert
Description: Dr. Wong is an expert in AI innovation and development. She will explore the technological advancements and novel features introduced in Llama3, Llama3.1, and Llama3.2, highlighting their contributions to the field of AI.
--------------------------------------------------
Name: Alex Kim
Affiliation: Gen AI Native Startup
Role: CEO
Description: Alex is the CEO of a startup that specializes in generative AI technologies. He is focused on the commercial applications and market potential of the Llama3 series models. His primary concern is how these models can be leveraged to create innovative products and services that can disrupt existing markets.
--------------------------------------------------
Name: Dr. Emily Park
Affiliation: AI Research Institute
Role: Lead Research Scientist
Description: Dr. Park is a lead research scientist at a prominent AI research institute. Her focus is on the technical advancements and innovations in the Llama3 series models. She is particularly interested in the architectural improvements and performance metrics of Llama3, Llama3.1, and Llama3.2.
--------------------------------------------------
Name: John Lee
Affiliation: Tech Media
Role: Tech Journalist
Description: John is a seasoned tech journalist who writes for a leading technology media outlet. His focus is on the broader implications of the Llama3 series models in the tech industry. He is interested in how these models compare to other state-of-the-art AI models and their potential impact on various sectors.
--------------------------------------------------
# Confirm we are happy
graph.update_state(thread, {"human_analyst_feedback":
None}, as_node="human_feedback")
{'configurable': {'thread_id': '1', 'checkpoint_ns': '', 'checkpoint_id': '1ef8083d-d45d-6946-8007-0b1d6cf236ee'}}
# Continue
for event in graph.stream(None, thread, stream_mode="updates"):
print("--Node--")
node_name = next(iter(event.keys()))
print(node_name)
--Node--
conduct_interview
--Node--
conduct_interview
--Node--
conduct_interview
--Node--
write_conclusion
--Node--
write_introduction
--Node--
write_report
--Node--
finalize_report
from IPython.display import Markdown
final_state = graph.get_state(thread)
report = final_state.values.get('final_report')
Markdown(report)
Leveraging Llama 3 for Market Disruption: Insights for Generative AI Applications¶
Introduction¶
The Llama 3 series, developed by Meta, marks a significant leap in generative AI, offering enhanced performance and new capabilities for commercial applications. This report explores how these models can be harnessed to create innovative products and services that disrupt existing markets. We delve into the architectural advancements and performance metrics of Llama 3, Llama 3.1, and Llama 3.2, highlighting their superior language processing abilities. Additionally, we examine the broader implications of these models in the tech industry, including their state-of-the-art capabilities, multilingual support, and optimization for edge devices.
The Llama 3 series, developed by Meta, marks a significant leap in the field of generative AI, offering enhanced performance and new capabilities that can be leveraged for commercial applications. These models, including Llama 3, Llama 3.1, and Llama 3.2, are designed to rival top AI models in various capabilities, such as general knowledge, steerability, math, tool use, and multilingual translation.
Llama 3 is a text-generation AI model similar to OpenAI's GPT and Anthropic's Claude models, generating text responses based on given prompts with notable improvements in contextual understanding and logical reasoning [1]. The series includes models with 8 billion and 70 billion parameters, designed to enhance processing power, versatility, and accessibility [2]. Key improvements in Llama 3 include the use of a tokenizer with a vocabulary of 128K tokens, which encodes language more efficiently, leading to substantially improved model performance. Additionally, the adoption of grouped query attention (GQA) across both the 8B and 70B sizes has improved inference efficiency [3].
One of the most groundbreaking aspects of Llama 3 is its open-source nature, which democratizes access to advanced AI capabilities and fosters innovation. The release of the 405B model, Llama 3.1, further pushes the boundaries by offering state-of-the-art capabilities in general knowledge, steerability, math, tool use, and multilingual translation [4]. This model is poised to supercharge innovation, providing unprecedented opportunities for growth and exploration [5].
Llama 3 introduces several key improvements over its predecessor, Llama 2, starting with its architecture. While maintaining a relatively standard decoder-only transformer architecture, Llama 3 incorporates a tokenizer with a vocabulary of 128K tokens, which encodes language more efficiently and leads to substantially improved model performance [1]. The training data for Llama 3 has also seen a significant increase, with a pretraining corpus expanded by 650%, providing a much richer dataset for the model to learn from [2]. This expansion in training data contributes to Llama 3's enhanced understanding and generation of language.
In real-world applications, Llama 3 has demonstrated notable improvements in the speed and accuracy of language tasks. These enhancements are not just theoretical but have been observed in practical scenarios, showcasing Llama 3's superior performance in handling complex language tasks [3]. However, it is worth noting that despite these advancements, Llama 3's performance in competitive benchmarks has shown variability, with win rates dropping from a high 50% to a low 40% in some cases [4]. Llama 3 also excels in multi-step tasks due to refined post-training processes that minimize false rejections, improve response alignment, and generate more diverse answers, making it more robust and versatile in handling a variety of language tasks compared to Llama 2 [5].
The Llama 3 series models are designed to support a wide range of languages and tasks, including coding, reasoning, and tool usage, making them highly versatile and applicable across various industries, from healthcare to finance [2][3]. The Llama 3.2 models, with 1B and 3B parameters, are optimized for on-device use cases, supporting a context length of 128K tokens, making them ideal for tasks such as summarization, instruction following, and rewriting on mobile and edge devices. They are also optimized for Qualcomm and MediaTek hardware, ensuring broad compatibility and efficient performance [3].
The development of LLMs has seen significant advancements since the introduction of the transformer architecture in 2017. The Llama 3 series builds on this foundation, offering models that are not only powerful but also openly available, which contrasts with the more restricted access of models like GPT-3 and GPT-4 [4]. By fine-tuning the Llama 3 models on data specific to particular industries, it is possible to create custom AI solutions that address unique challenges. For instance, in healthcare, these models can assist in medical image analysis, while in finance, they can be used for predictive analytics and risk assessment [5][6].
Meta has emphasized the importance of safety in the development of the Llama 3.1 models. This focus on safety is crucial as these models are deployed in various sensitive applications, ensuring that they operate within ethical guidelines and minimize potential risks [5]. The Llama 3 series models are poised to have a significant impact on the tech industry, offering advanced capabilities and broad applicability across multiple sectors. Their open availability and optimization for edge devices further enhance their potential to drive innovation and address complex challenges in various domains.
Conclusion¶
The Llama 3 series, developed by Meta, marks a significant leap in generative AI, offering enhanced performance and new capabilities that can disrupt existing markets. This report has explored the potential of Llama 3 for market disruption, highlighting its open-source nature and advanced features. We delved into the architectural improvements and performance metrics of Llama3, Llama3.1, and Llama3.2, noting substantial advancements over Llama2. Additionally, we examined the broader implications of these models in the tech industry, emphasizing their state-of-the-art capabilities, multilingual support, and optimization for edge devices. The Llama 3 series stands poised to drive innovation and address complex challenges across various sectors.
Sources¶
[1] https://www.datacamp.com/blog/meta-announces-llama-3-the-next-generation-of-open-source-llms
[2] https://techcrunch.com/2024/04/18/meta-releases-llama-3-claims-its-among-the-best-open-models-available/
[3] https://ai.meta.com/blog/meta-llama-3/
[4] https://builtin.com/articles/llama-3
[5] https://ai.meta.com/blog/meta-llama-3-1/
[6] https://www.unite.ai/everything-you-need-to-know-about-llama-3-most-powerful-open-source-model-yet-concepts-to-usage/
[7] https://kanerika.com/blogs/llama-3-vs-llama-2/
[8] https://www.linkedin.com/pulse/comprehensive-technical-analysis-llama-3-comparison-2-ibad-rehman-kw8pe
[9] https://lmsys.org/blog/2024-05-08-llama3/
[10] https://blog.monsterapi.ai/blogs/what-is-llama-3-and-how-it-differs-from-llama-2/
[11] https://arxiv.org/abs/2407.21783
[12] https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/
[13] https://en.wikipedia.org/wiki/Large_language_model
[14] https://www.datacamp.com/blog/llama-3-1-405b-meta-ai
[15] https://www.indikaai.com/blog/the-llama-3-herd-of-models-a-revolution-in-multilingual-ai-and-beyond
We can see that the three interviews (conduct_interview) and the report-writing nodes each ran as expected.
Is the resulting report actually good? That is debatable.
- Solid information about Llama3 and Llama3.1 is well covered.
- It did not consult the original "Herd of Models" paper, but it did draw on blog posts that reinterpret it.
- Other key points, such as GQA and the 405B model, are captured well.
- The problem is the coverage of Llama3.2.
- The report completely misses that very small models were added and that cross-attention was used to incorporate vision input.
Analyzing the cause: as of the day I ran this code (2024-10-02), Llama3.2 had been out for only seven days. Popular articles reinterpreting the model presumably did not yet exist on the web, so the searches could not surface them. This is something we need to fix.
Trying It in LangGraph Studio¶
If you turn the graph code above into a .py file and add it to a Studio project, you can run it with a UI as shown below.
For the .py code, see https://github.com/langchain-ai/langchain-academy/tree/main/module-4/studio.
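A Studio project is essentially the graph module plus a langgraph.json config file. As a rough sketch of what that config typically looks like (the file and graph names here are assumptions, not the exact contents of the linked repo):
{
  "dependencies": ["."],
  "graphs": {
    "research_assistant": "./research_assistant.py:graph"
  },
  "env": ".env"
}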
Since the Llama3.2 coverage was lacking, I had the graph run the research again.
This time it did produce results about Llama3.2, and it referenced Meta's official Llama3.2 material well.
However, it failed to properly distinguish Llama 3 from Llama 3.2, so the content is muddled and contains many incorrect statements. Not good enough.
It seems we need to add human-in-the-loop at the search step so that irrelevant sources can be pruned out (see the sketch below).
A link to the result is here.
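Sketched below is one possible direction, reusing interview_builder from above: compile the interview sub-graph with an interrupt right after the search nodes so a human can inspect what was retrieved before the expert answers. The thread id is arbitrary, and this is a sketch rather than a finished fix.
# Pause the interview sub-graph after retrieval so a human can review the sources
review_memory = MemorySaver()
review_graph = interview_builder.compile(
    checkpointer=review_memory,
    interrupt_after=["search_web", "search_wikipedia"],
)

review_thread = {"configurable": {"thread_id": "review-1"}}
review_graph.invoke(
    {"analyst": analysts[0],
     "messages": [HumanMessage(content=f"So you said you were writing an article on {topic}?")],
     "max_num_turns": 2},
    review_thread,
)

# Inspect what was retrieved at the interrupt
retrieved = review_graph.get_state(review_thread).values["context"]
print(len(retrieved), "retrieved document batches")
# Note: `context` uses an operator.add reducer, so update_state() would append rather than
# replace; actual pruning would need a dedicated filtering node or a different reducer.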
Conclusion
We were able to build a decent research assistant, but it is not perfect. We will keep looking for ways to address its shortcomings.
The next topic is Reflection.