URLs & PDFs
Pass a URL anywhere in the prompt and it will be crawled and scraped automatically. You can also name specific page tags or elements to improve extraction accuracy. Consider this prompt:
user_content='https://www.houzz.com/professionals/landscape-contractors/southern-turf-co-nashville-pfvwus-pf~382005468 {contractor name, phone_number, website_url}, website_url should be the pro/contractor/business site. Targeting <main>'
Scoping the request this way is likely to be faster and more accurate than relying on the full-page scrape alone.
Additionally, max_tokens requires special treatment: for URLs, PDFs, YouTube links, and similar inputs, max_tokens caps the TOTAL token count, i.e. all scraped content plus the final output. The boundary is approximate, so pass at least 6000 plus your desired response maximum to leave room for the crawled data.
Depending on the response you want, you may need an even larger max_tokens to ensure adequate crawled input data.
Due to built-in timeouts, very long-running requests (typically those with max_tokens above roughly 30-40k) are handled best via a raw HTTP request rather than an OpenAI client package; the endpoint schema, parameters, and response shape are the same either way.