Now that we have:
- Used ChatGPT to analyze a map of the highest-income neighborhoods and confirmed it produces meaningful content
- Reviewed Google Search's policies related to AI-generated content and decided that our content satisfies those criteria
- Experimented with different prompts and settled on a good enough prompt to use in our ChatGPT requests
- Written code to take a screenshot of the HTML maps and call the ChatGPT API

We can bring all of this together in a script that generates content for the top 50 US cities and start serving that content on our website.
The following Colab Notebook contains the code used to automate this process.
We’ll explain what goes on in this automation step by step below.
- Read the list of cities and the URLs of the income maps. We use the following code to load the list of cities and the map URLs we need to read:
```python
import pandas as pd

url = 'https://storage.googleapis.com/decision-science-lab-bucket/datasets/test/maps_for_chatgpt_analysis.csv'
df = pd.read_csv(url)
print(f'read {len(df)} records')
df.head()
```
This is the result we get
- Loop over the dataframe and do the following for each city (a sketch of the complete loop follows the ChatGPT code below):
    - Create and save a screenshot of the map. The code below reads the map of Austin (the first item on the list).
    - Call ChatGPT with the map screenshot and the prompt.
```python
from selenium import webdriver

def web_driver(window_size_width=1920, window_size_height=1200):
    # Configure a headless Chrome driver that can run inside Colab
    options = webdriver.ChromeOptions()
    options.add_argument("--verbose")
    options.add_argument('--no-sandbox')
    options.add_argument('--headless')
    options.add_argument('--disable-gpu')
    options.add_argument(f"--window-size={window_size_width},{window_size_height}")
    options.add_argument('--disable-dev-shm-usage')
    driver = webdriver.Chrome(options=options)
    return driver

def save_screenshot_of_url(url, path):
    # Open the HTML map in the headless browser and save a screenshot of it
    driver = web_driver(800, 600)
    driver.get(url)
    driver.save_screenshot(path)

city = 'austin'
url = 'https://storage.googleapis.com/decision-science-lab-bucket/client_map_assets/free-location-analyzer/census_reports/seo_map_miles_20_n_cbg_35_Austin_TX_01_17_2024_19_45.html'
screenshot_path = f'{city}.png'
save_screenshot_of_url(url, screenshot_path)
```
```python
import base64
import requests
from google.colab import userdata

# Function to encode the image as base64 so it can be embedded in the API request
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

def get_image_analysis(image_path, prompt):
    # Send the prompt and the encoded map image to the ChatGPT vision model
    base64_image = encode_image(image_path)
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {userdata.get('openai_api_key')}"
    }
    payload = {
        "model": "gpt-4-vision-preview",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}
                ]
            }
        ],
        "max_tokens": 1000
    }
    response = requests.post("https://api.openai.com/v1/chat/completions",
                             headers=headers, json=payload)
    return response.json()

def get_prompt_web(city):
    prompt_web = f"""you are writing copy for a webpage showing a map of the city of {city}.
The heart icon is the city center. The highest income areas are shaded pink areas.
The numbers in the pink areas show the ranking of the income level. Lowest ranking means the highest income.
Write a paragraph introducing {city} and explain where the highest income areas are located with respect to the city center and with respect to other areas in 2-3 other paragraphs.
Talk about if there's clustering of rich areas and any other patterns you can detect with respect to these rich areas.
This copy will be for non technical users. Try to be as simple as possible.
this map does not contain any sensitive socioeconomic data. you don't need to mention this in the text. the data comes from public census office publications.
refer to the image as the map. try to include area, city, neighborhood names that you can parse from the map.
try to include the following seo keywords
demographic profile analysis
Neighborhood Stats
best neighborhood
neighborhood insights
location analysis
add a short paragraph at the end to explain that this content is for non technical users to help them analyze the map and data and is generated by chatgpt and AI"""
    return prompt_web
```
```python
prompt = get_prompt_web(city)
result = get_image_analysis(screenshot_path, prompt)
```
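To tie these pieces together, here is a minimal sketch of what the per-city loop could look like. The column names `city` and `url` are assumptions, since the CSV layout isn't shown here; the functions are the ones defined above, and each response is stored in the `results` dictionary used in the next step.

```python
# Minimal sketch of the per-city loop; the 'city' and 'url' column names are
# assumptions -- adjust them to match the actual columns in the CSV file.
results = {}
for _, row in df.iterrows():
    city = row['city']                                  # assumed column name
    map_url = row['url']                                # assumed column name
    screenshot_path = f'{city}.png'
    save_screenshot_of_url(map_url, screenshot_path)    # defined earlier
    prompt = get_prompt_web(city)                       # defined earlier
    results[city] = get_image_analysis(screenshot_path, prompt)
```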
- Save the ChatGPT results so the website can use them dynamically. As we loop over all the cities and URLs, we store each ChatGPT response in a dictionary called results. The following code saves the results into an Excel file; in the real system we save them to MongoDB (sketched after the Excel code below) so they can be accessed from the website when a page is served.
```python
df_result = pd.DataFrame.from_dict(results, orient='index')
df_result.to_excel('chatgpt_results.xlsx')
```
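For the production setup mentioned above, the same results can be written to MongoDB instead of Excel. This is a hedged sketch only: the connection string, database, and collection names are assumptions; the only detail taken from the text is that the city is used as the key.

```python
# Hedged sketch: connection string, database and collection names are assumptions.
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017')        # assumed connection string
collection = client['seo_content']['city_income_maps']   # assumed db/collection names

for city, chat_gpt_result in results.items():
    # Upsert one document per city, with the city name as the key
    collection.replace_one(
        {'city': city},
        {'city': city, 'chat_gpt_result': chat_gpt_result},
        upsert=True,
    )
```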
- Adjust city webpages to read content from saved ChatGPT results
- We have saved the ChatGPT responses for each city into MongoDB with the following document format. We use the city as the key, and the main content is stored in the chat_gpt_result.choices.message item.
- When somebody visits a URL such as https://decisionsciencelab.com/city_metrics/income/austin we query the database for Austin's ChatGPT content and dynamically display it on the page together with the map (a sketch of this lookup follows below).
This enables us to refresh the ChatGPT content without changing anything in the webpage code. If we find better prompts or use better map images, we can simply rerun our automation and update the content in our MongoDB database, and the page will automatically show the refreshed content.
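As an illustration of that dynamic lookup, here is a minimal sketch of what the page handler could look like. Flask, the template name, and the connection details are assumptions about the site's stack; the document shape (city as the key, the generated text under chat_gpt_result → choices → message, following the standard OpenAI response layout) matches the description above.

```python
# Hedged sketch of the page handler; Flask, the template name and the
# connection details are assumptions about the site's stack.
from flask import Flask, render_template
from pymongo import MongoClient

app = Flask(__name__)
collection = MongoClient('mongodb://localhost:27017')['seo_content']['city_income_maps']

@app.route('/city_metrics/income/<city>')
def income_page(city):
    doc = collection.find_one({'city': city})
    # The generated copy sits under chat_gpt_result -> choices -> message
    content = doc['chat_gpt_result']['choices'][0]['message']['content']
    return render_template('income_map.html', city=city, content=content)
```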