Connecting to a Feature Store

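Feature stores are a concept from traditional machine learning that ensures the data fed into models is up to date and relevant. For more on this, see here.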

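This concept is extremely relevant when putting LLM applications into production. To personalize an LLM application, you may want to combine the LLM with up-to-date information about specific users. Feature stores are a great way to keep that data fresh, and LangChain provides an easy way to combine that data with LLMs.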

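In this notebook, we show how to connect prompt templates to feature stores. The basic idea is to call a feature store from inside a prompt template to retrieve values, which are then formatted into the prompt.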
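The basic idea can be sketched without installing any feature store. In the minimal sketch below, `FAKE_FEATURE_STORE`, `lookup_features`, and `FeatureBackedTemplate` are hypothetical stand-ins, not LangChain or Feast APIs; a plain dictionary plays the role of the online store.

```python
# A plain dict stands in for an online feature store (hypothetical data).
FAKE_FEATURE_STORE = {
    "user_1": {"plan": "pro", "logins_last_7d": 12},
}


def lookup_features(user_id: str) -> dict:
    """Return the latest feature values for one entity."""
    return FAKE_FEATURE_STORE[user_id]


class FeatureBackedTemplate:
    """Formats a prompt, pulling fresh values at format time."""

    def __init__(self, template: str) -> None:
        self.template = template

    def format(self, **kwargs) -> str:
        # The caller supplies only the entity key ...
        user_id = kwargs.pop("user_id")
        # ... and every other value is looked up when the prompt is built.
        kwargs.update(lookup_features(user_id))
        return self.template.format(**kwargs)


demo_template = FeatureBackedTemplate(
    "The user is on the {plan} plan and logged in {logins_last_7d} times this week."
)
demo_prompt = demo_template.format(user_id="user_1")
print(demo_prompt)
```

Because the lookup happens inside `format`, each call reflects whatever the store holds at that moment, which is the property the rest of this notebook relies on.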

Feast

To start, we will use Feast, a popular open-source feature store framework.

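We assume you have already completed the getting-started steps in the README. We will build on that getting-started example and create an LLMChain that writes a note to a specific driver about their up-to-date statistics.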

Load Feast Store

Again, this should be set up according to the instructions in the Feast README.

from feast import FeatureStore

# You may need to update the path depending on where you stored it
feast_repo_path = "../../../../../my_feature_repo/feature_repo/"
store = FeatureStore(repo_path=feast_repo_path)

Prompts

Here we set up a custom FeastPromptTemplate. This prompt template takes in a driver ID, looks up that driver's stats, and formats those stats into a prompt.

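Note that the only input to this prompt template is driver_id, since that is the only user-defined variable (all other variables are looked up inside the prompt template).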

from langchain.prompts import PromptTemplate, StringPromptTemplate

template = """Given the driver's up to date stats, write them note relaying those stats to them.
If they have a conversation rate above .5, give them a compliment. Otherwise, make a silly joke about chickens at the end to make them feel better

Here are the drivers stats:
Conversation rate: {conv_rate}
Acceptance rate: {acc_rate}
Average Daily Trips: {avg_daily_trips}

Your response:"""

prompt = PromptTemplate.from_template(template)


class FeastPromptTemplate(StringPromptTemplate):
    def format(self, **kwargs) -> str:
        # The caller passes only the driver ID; the stats come from Feast.
        driver_id = kwargs.pop("driver_id")
        feature_vector = store.get_online_features(
            features=[
                "driver_hourly_stats:conv_rate",
                "driver_hourly_stats:acc_rate",
                "driver_hourly_stats:avg_daily_trips",
            ],
            entity_rows=[{"driver_id": driver_id}],
        ).to_dict()
        # One entity row was requested, so take the first value of each feature.
        kwargs["conv_rate"] = feature_vector["conv_rate"][0]
        kwargs["acc_rate"] = feature_vector["acc_rate"][0]
        kwargs["avg_daily_trips"] = feature_vector["avg_daily_trips"][0]
        return prompt.format(**kwargs)


prompt_template = FeastPromptTemplate(input_variables=["driver_id"])
print(prompt_template.format(driver_id=1001))

Given the driver's up to date stats, write them note relaying those stats to them.
If they have a conversation rate above .5, give them a compliment. Otherwise, make a silly joke about chickens at the end to make them feel better

Here are the drivers stats:
Conversation rate: 0.4745151400566101
Acceptance rate: 0.055561766028404236
Average Daily Trips: 936

Your response:
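
The `[0]` indexing in the `format` method follows from the shape of the lookup result: the dictionary returned by `to_dict()` maps each name to a list with one entry per requested entity row. The sketch below uses a hand-written dictionary in place of a live Feast call; `sample_vector` (with values rounded from the printed output) and the helper `first_row` are hypothetical and not part of Feast.

```python
# Hand-written stand-in for the dictionary a single-row lookup produces.
# Each key maps to a list with one value per entity row.
sample_vector = {
    "driver_id": [1001],
    "conv_rate": [0.4745],
    "acc_rate": [0.0556],
    "avg_daily_trips": [936],
}


def first_row(feature_vector: dict, names: list) -> dict:
    """Pick the first (and here only) row's value for each named feature."""
    return {name: feature_vector[name][0] for name in names}


stats = first_row(sample_vector, ["conv_rate", "acc_rate", "avg_daily_trips"])
print(stats)
```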

Use in a Chain

We can now use this in a chain, creating a personalized chain backed by a feature store.

from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMChain

chain = LLMChain(llm=ChatOpenAI(), prompt=prompt_template)
chain.run(1001)

"Hi there! I wanted to update you on your current stats. Your acceptance rate is 0.055561766028404236 and your average daily trips are 936. While your conversation rate is currently 0.4745151400566101, I have no doubt that with a little extra effort, you'll be able to exceed that .5 mark! Keep up the great work! And remember, even chickens can't always cross the road, but they still give it their best shot."
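
Note that `chain.run(1001)` passes a bare value rather than a keyword argument. That is only unambiguous because the prompt declares a single input variable. The sketch below imitates that behavior with stand-ins: `TinyChain`, `SingleInputPrompt`, and `echo_llm` are hypothetical and are not LangChain's implementation.

```python
class SingleInputPrompt:
    """Stand-in prompt template that declares exactly one input variable."""

    input_variables = ["driver_id"]

    def format(self, **kwargs) -> str:
        return f"Write a note for driver {kwargs['driver_id']}."


def echo_llm(prompt_text: str) -> str:
    """Stand-in model: returns the prompt in upper case instead of calling an API."""
    return prompt_text.upper()


class TinyChain:
    """Formats the prompt, then hands the text to the model."""

    def __init__(self, llm, prompt) -> None:
        self.llm = llm
        self.prompt = prompt

    def run(self, value):
        # A bare value can only be routed when there is one input variable.
        if len(self.prompt.input_variables) != 1:
            raise ValueError("run(value) needs exactly one input variable")
        key = self.prompt.input_variables[0]
        return self.llm(self.prompt.format(**{key: value}))


tiny_result = TinyChain(echo_llm, SingleInputPrompt()).run(1001)
print(tiny_result)
```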

Tecton

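Above, we showed how to use Feast, a popular open-source and self-managed feature store, with LangChain. The example below shows a similar integration using Tecton. Tecton is a fully managed feature platform that orchestrates the complete ML feature lifecycle, from transformation to online serving, with enterprise-grade SLAs.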

Prerequisites

A Tecton deployment (sign up at https://tecton.ai)

The TECTON_API_KEY environment variable set to a valid service account key
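
Since the key is read from the environment, it can help to fail fast with a clear message when it is missing. The following is a minimal sketch; `require_env` is a hypothetical helper and not part of the Tecton SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before connecting")
    return value


# Report the problem instead of crashing when the key is absent.
try:
    require_env("TECTON_API_KEY")
    print("TECTON_API_KEY found")
except RuntimeError as err:
    print(err)
```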

Define and Load Features

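We will use the user_transaction_counts Feature View from the Tecton tutorial as part of a Feature Service. For simplicity, we use only a single Feature View; more sophisticated applications may need more feature views to retrieve the features required by the prompt.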

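from tecton import FeatureService

# user_transaction_counts is the Feature View defined in the Tecton tutorial
user_transaction_metrics = FeatureService(
    name="user_transaction_metrics",
    features=[user_transaction_counts],
)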

The Feature Service above is expected to be applied to a live workspace. For this example, we use the "prod" workspace.

import tecton

workspace = tecton.get_workspace("prod")
feature_service = workspace.get_feature_service("user_transaction_metrics")

Prompts

Here we set up a custom TectonPromptTemplate. This prompt template takes in a user ID, looks up that user's stats, and formats those stats into a prompt.

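Note that the only input to this prompt template is user_id, since that is the only user-defined variable (all other variables are looked up inside the prompt template).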

from langchain.prompts import PromptTemplate, StringPromptTemplate

template = """Given the vendor's up to date transaction stats, write them a note based on the following rules:

1. If they had a transaction in the last day, write a short congratulations message on their recent sales
2. If no transaction in the last day, but they had a transaction in the last 30 days, playfully encourage them to sell more.
3. Always add a silly joke about chickens at the end

Here are the vendor's stats:
Number of Transactions Last Day: {transaction_count_1d}
Number of Transactions Last 30 Days: {transaction_count_30d}

Your response:"""

prompt = PromptTemplate.from_template(template)


class TectonPromptTemplate(StringPromptTemplate):
    def format(self, **kwargs) -> str:
        user_id = kwargs.pop("user_id")
        feature_vector = feature_service.get_online_features(
            join_keys={"user_id": user_id}
        ).to_dict()
        kwargs["transaction_count_1d"] = feature_vector[
            "user_transaction_counts.transaction_count_1d_1d"
        ]
        kwargs["transaction_count_30d"] = feature_vector[
            "user_transaction_counts.transaction_count_30d_1d"
        ]
        return prompt.format(**kwargs)


prompt_template = TectonPromptTemplate(input_variables=["user_id"])
print(prompt_template.format(user_id="user_469998441571"))

Given the vendor's up to date transaction stats, write them a note based on the following rules:

1. If they had a transaction in the last day, write a short congratulations message on their recent sales
2. If no transaction in the last day, but they had a transaction in the last 30 days, playfully encourage them to sell more.
3. Always add a silly joke about chickens at the end

Here are the vendor's stats:
Number of Transactions Last Day: 657
Number of Transactions Last 30 Days: 20326

Your response:
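
The lookups in `TectonPromptTemplate.format` translate namespaced feature keys into the shorter variable names used by the template. One way to keep that translation in a single place is a small mapping table. This is a sketch only: `FEATURE_TO_VARIABLE`, `to_template_vars`, and `sample_features` are hypothetical names, and the sample values are copied from the printed output rather than fetched from Tecton.

```python
# Namespaced feature key -> variable name used in the prompt template.
FEATURE_TO_VARIABLE = {
    "user_transaction_counts.transaction_count_1d_1d": "transaction_count_1d",
    "user_transaction_counts.transaction_count_30d_1d": "transaction_count_30d",
}


def to_template_vars(feature_vector: dict) -> dict:
    """Rename the features the prompt needs; a missing key raises KeyError."""
    return {var: feature_vector[key] for key, var in FEATURE_TO_VARIABLE.items()}


# Hand-written stand-in for the dictionary returned by the online lookup.
sample_features = {
    "user_transaction_counts.transaction_count_1d_1d": 657,
    "user_transaction_counts.transaction_count_30d_1d": 20326,
}
template_vars = to_template_vars(sample_features)
print(template_vars)
```

Adding a feature to the prompt then means adding one row to the table and one placeholder to the template.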

Use in a Chain

We can now use this in a chain, creating a chain that achieves personalization backed by the Tecton feature platform.

from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMChain

chain = LLMChain(llm=ChatOpenAI(), prompt=prompt_template)
chain.run("user_469998441571")

'Wow, congratulations on your recent sales! Your business is really soaring like a chicken on a hot air balloon! Keep up the great work!'

Featureform

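Finally, we will run the same example with Featureform, an open-source, enterprise-grade feature store. Featureform lets you work with your existing infrastructure, such as Spark or a local setup, to define your feature transformations.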

Initialize Featureform

You can follow the instructions in the README to initialize your transformations and features in Featureform.

import featureform as ff

client = ff.Client(host="demo.featureform.com")

Prompts

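Here we set up a custom FeatureformPromptTemplate. This prompt template takes in a user ID, looks up the average amount that user pays per transaction, and formats it into a prompt.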

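Note that the only input to this prompt template is user_id, since that is the only user-defined variable (the average transaction amount is looked up inside the prompt template).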

from langchain.prompts import PromptTemplate, StringPromptTemplate

template = """Given the amount a user spends on average per transaction, let them know if they are a high roller. Otherwise, make a silly joke about chickens at the end to make them feel better

Here are the user's stats:
Average Amount per Transaction: ${avg_transaction}

Your response:"""

prompt = PromptTemplate.from_template(template)


class FeatureformPromptTemplate(StringPromptTemplate):
    def format(self, **kwargs) -> str:
        user_id = kwargs.pop("user_id")
        fpf = client.features([("avg_transactions", "quickstart")], {"user": user_id})
        # Assumption: client.features returns one value per requested feature, in order.
        kwargs["avg_transaction"] = fpf[0]
        return prompt.format(**kwargs)


prompt_template = FeatureformPromptTemplate(input_variables=["user_id"])
print(prompt_template.format(user_id="C1410926"))
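
Every placeholder in the template has to be filled by the time `prompt.format` is called, so a misspelled or forgotten variable only surfaces at format time. A small check with the standard library's `string.Formatter` can catch this earlier. This is a sketch under stated assumptions: `missing_placeholders` and `demo_template` are hypothetical and not part of LangChain or Featureform.

```python
from string import Formatter


def missing_placeholders(template: str, values: dict) -> set:
    """Return the placeholder names in `template` that `values` does not supply."""
    needed = {field for _, field, _, _ in Formatter().parse(template) if field}
    return needed - set(values)


# Hypothetical template with one placeholder; "$" is ordinary literal text.
demo_template = "Average Amount per Transaction: ${avg_transaction}"

# Only the user ID has been supplied so far, so the average is reported missing.
print(missing_placeholders(demo_template, {"user_id": "C1410926"}))

# Once the looked-up value is added, nothing is missing.
print(missing_placeholders(demo_template, {"avg_transaction": 25.0}))
```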

Use in a Chain

We can now use this in a chain, creating a chain that achieves personalization backed by the Featureform feature platform.

from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMChain

chain = LLMChain(llm=ChatOpenAI(), prompt=prompt_template)
chain.run("C1410926")
