{"text":"Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100,a 1,000times bigger.....}
Configuring OPENAI_API_KEY
Looking at the source of the class OpenAI(SyncAPIClient) implementation, if api_key and base_url are not passed in explicitly, the client falls back to the OPENAI_API_KEY and OPENAI_BASE_URL environment variables.
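Before the client is created, the key therefore needs to be available in the environment. A minimal sketch of the usual options, with a placeholder key value (the python-dotenv approach mentioned later in this article is shown as a comment):

# Option 1: export the variable in the shell before starting Python
#   export OPENAI_API_KEY="sk-..."        (Linux/macOS)
#   set OPENAI_API_KEY=sk-...             (Windows cmd)

# Option 2: keep the key in a .env file and load it with python-dotenv
#   from dotenv import load_dotenv
#   load_dotenv()

# Option 3: set it from Python itself (placeholder value)
import os
os.environ["OPENAI_API_KEY"] = "sk-..."

from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY (and OPENAI_BASE_URL, if set)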
from openai import OpenAI

client = OpenAI()

# Ask the model to describe an image that is referenced by URL.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What’s in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0])
import base64
import requests

api_key = "YOUR_OPENAI_API_KEY"

# Encode a local image as base64 so it can be embedded in the request payload.
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

image_path = "path_to_your_image.jpg"
base64_image = encode_image(image_path)

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}",
}

payload = {
    "model": "gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What’s in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"},
                },
            ],
        }
    ],
    "max_tokens": 300,
}

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers=headers,
    json=payload,
)

print(response.json())
Two model variants are offered for text to speech: tts-1 is optimized for real-time use cases, while tts-1-hd is optimized for quality. These models can be used with the speech endpoint in the Audio API; a request specifies the text that should be turned into audio and the voice to use for the generated audio.
if api_key is None:
    api_key = os.environ.get("OPENAI_API_KEY")
if api_key is None:
    raise OpenAIError(
        "The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable"
    )
self.api_key = api_key
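As the check above shows, the alternative is to pass the key (and, if needed, a custom base URL) directly when constructing the client. A minimal sketch with placeholder values:

from openai import OpenAI

# Passing api_key explicitly skips the OPENAI_API_KEY environment-variable lookup.
client = OpenAI(
    api_key="sk-...",                      # placeholder key
    base_url="https://api.openai.com/v1",  # optional; this is the default endpoint
)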
from openai import OpenAI

client = OpenAI()

# Multi-turn chat: earlier user and assistant turns are passed back in as context.
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"},
    ],
)

print(response)
Data structure
{"choices":[{"finish_reason":"stop","index":0,"message":{"content":"The 2020 World Series was played in Texas at Globe Life Field in Arlington.","role":"assistant"},"logprobs":null}],"created":1677664795,"id":"chatcmpl-7QyqpwdfhqwajicIEznoc6Q47XAyW","model":"gpt-3.5-turbo-0613","object":"chat.completion","usage":{"completion_tokens":17,"prompt_tokens":57,"total_tokens":74}}
Currently there is no difference between the open-source version of Whisper and the version available through our API. It was trained on a large dataset of diverse audio and is a multitask model that can perform multilingual speech recognition as well as speech translation and language identification, and it accepts common audio formats such as mp4 and mpeg.

For image generation, standard-quality images are the fastest to generate. You can request one image at a time with DALL·E 3 (and more through parallel requests), or up to 10 images at once with DALL·E 2 using the n parameter. DALL·E 2 also supports editing an existing image and creating variations of a user-provided image; a sketch of an image-generation request appears at the end of this section.

When installing Python, make sure to tick the "Add Python to PATH" option so that Python can be used directly from the cmd command line. Loading the API key from a .env file is typically done with from dotenv import load_dotenv and the load_dotenv() function.

Like gpt-3.5-turbo, GPT-4 is optimized for chat, but it also performs well on traditional completion tasks through the Chat Completions API.

A simple text-to-speech request looks like this:
from pathlib import Path
from openai import OpenAI

client = OpenAI()

# Write the generated speech next to this script as an MP3 file.
speech_file_path = Path(__file__).parent / "speech.mp3"

response = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Today is a wonderful day to build something people love!",
)

response.stream_to_file(speech_file_path)
Voice selection
Experiment with the different voices (alloy, echo, fable, onyx, nova, and shimmer) to find one that matches the tone and audience you are going for.
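Finally, the image-generation request referenced earlier in this section. A minimal sketch with a placeholder prompt; DALL·E 3 only accepts n=1, while DALL·E 2 allows n up to 10:

from openai import OpenAI

client = OpenAI()

# DALL·E 3: one image per request (send parallel requests for more).
response = client.images.generate(
    model="dall-e-3",
    prompt="a watercolor painting of a lighthouse at sunrise",  # placeholder prompt
    size="1024x1024",
    quality="standard",  # standard-quality images are the fastest to generate
    n=1,
)

print(response.data[0].url)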