Oct 28, 2025·8 min read

Real estate listings app with AI tools: imports and photos

Build a real estate listings app with AI tools by planning imports, handling duplicates, and optimizing photos so pages load fast and data stays clean.

Why listings apps get slow and messy

When a real estate site gets “slow,” it usually looks like this: search results take seconds, filters lag or reset, listing pages load with blank photo boxes, and the same property appears two or three times with slightly different addresses. Users don’t wait. They bounce. Agents stop trusting the system.

The root cause is rarely one big bug. It’s usually a few quiet problems that grow with every import.

Imports are the first one. Early on, it feels fine to “just load the CSV” or pull an MLS feed and clean it up later. But imports create a lot of rows fast. One mismatch (address formats, missing IDs, inconsistent dates) turns into messy data your app has to fight on every page view.

Duplicates are the second. The same listing arrives from multiple sources, a scheduled job runs twice, or address formatting changes slightly. If you don’t decide what counts as “the same property,” duplicates spread into favorites, leads, and analytics. Cleaning them up later becomes risky.

Photos are the third. Images are often the heaviest part of the page. Serving full-size photos, generating too many variants, or loading every image at once will slow down even a simple listing page.

Success looks boring (in the best way): search feels instant, every property has one canonical record across imports, and photos upload reliably without making pages heavy.

This guide is for founders and small teams building a listings app with AI tools, where the first version “works” but starts breaking under real data.

Define the data you actually need

Before you build screens, decide what data you’ll trust on day one. AI tools can produce something that looks finished quickly, but imports and messy fields will break it the moment real data arrives.

Start by choosing your first data source:

  • CSV is easiest for a first launch, but it often hides inconsistent columns and surprise formats (dates, prices, addresses).
  • A live feed is usually better long-term, but it adds scheduling, rules, and more ways to fail.
  • Manual entry is cleanest, but only works if someone owns the job of keeping listings current.

Mixed sources are normal, but only if you decide which source wins when two records disagree.

Next, define what “published” means. Who can publish listings: admins only, agents, or anyone on the team? Can a listing exist as a draft but stay hidden until photos are approved and the address is validated? If you don’t define this up front, drafts leak into search and you end up with “half listings” in the wild.

Then pick the filters you actually need and ignore the rest for now. Most buyers start with price, beds, baths, city or neighborhood, and a simple status (active, pending, sold). Extra filters can wait until you see real usage.

Finally, define uniqueness before importing anything. Decide what makes one listing “the same” across sources. If you skip this step, every later decision becomes a patch.

A practical approach:

  • Primary identity (best): MLS ID or provider listing ID.
  • Fallback identity (when no ID exists): a normalized address plus unit, paired with one or two stabilizers like list price and agent or office.
  • Update rules: price drops and photo updates should update an existing listing, not create a new one.
  • Multi-unit rules: one building with many units should not collapse into one record.
  • Merge rules: if duplicates happen, you need a predictable way to combine them and pick a source of truth.
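The identity rules above can be sketched as a small helper. This is a minimal sketch, assuming illustrative field names (`mls_id`, `street`, `unit`, `postal_code`, `list_price`) rather than a fixed schema:

```python
import hashlib
import re

def listing_identity(listing: dict) -> str:
    """Return a stable identity key for one listing.

    Prefers a provider ID; falls back to a fingerprint of the
    normalized address plus a couple of stabilizers. Field names
    here are illustrative, not a required schema.
    """
    if listing.get("mls_id"):
        return f"mls:{listing['mls_id']}"
    # Fallback: lowercase the street, strip punctuation, collapse spaces
    street = re.sub(r"[^a-z0-9 ]", "", listing.get("street", "").lower())
    street = re.sub(r"\s+", " ", street).strip()
    parts = [
        street,
        str(listing.get("unit") or ""),
        str(listing.get("postal_code") or ""),
        str(listing.get("list_price") or ""),
    ]
    digest = hashlib.sha256("|".join(parts).encode()).hexdigest()[:16]
    return f"addr:{digest}"
```

Hashing the normalized parts keeps the key short and stable, so minor punctuation or casing changes in the source file don't create a new identity.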

A simple data model that survives imports

Many listings apps begin as quick prototypes, then fall apart once you import real data. A stable data model keeps imports predictable, makes changes easier to detect, and prevents “mystery edits” nobody can explain later.

A simple model that holds up well looks like five core tables (or collections):

  • Listings
  • People (agents and brokers)
  • Photos
  • Locations
  • Import sources (feeds, files, providers)

Keep relationships easy to reason about: a listing belongs to one location, has one agent (and optionally a broker), and has many photos.

To avoid broken pages, be strict about what’s required.

On the listing, treat these as required: the source feed name, the source listing ID, status, address fields, postal code, price (or rent), and property type. Bedrooms, bathrooms, square footage, year built, and description are useful, but they’re not safe to require if you’re importing from real-world files.

For photos, require the listing ID, a stable photo identity (source photo ID or a hash of the source URL), and a sort order. Everything else is optional metadata.

To track updates, store source IDs exactly as the feed provides them, plus a stable internal ID. Enforce uniqueness on (source_feed, source_listing_id). Also store timestamps like source_updated_at and your own imported_at.

For sold or removed listings, avoid hard deletes. Use a status like active, pending, sold, or off_market, plus a removed_at timestamp. Keeping the record prevents broken bookmarks and keeps analytics consistent. If your feed flips back and forth, a simple status history record (listing_id, status, changed_at, source_reason) saves hours later.
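The uniqueness constraint and status history described above can be sketched in a few lines of SQL. This uses SQLite for brevity, and the table and column names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE listings (
    id INTEGER PRIMARY KEY,
    source_feed TEXT NOT NULL,
    source_listing_id TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'active',
    street TEXT NOT NULL,
    postal_code TEXT NOT NULL,
    price INTEGER NOT NULL,
    source_updated_at TEXT,
    imported_at TEXT,
    removed_at TEXT,
    UNIQUE (source_feed, source_listing_id)  -- identity across imports
);
CREATE TABLE status_history (
    listing_id INTEGER NOT NULL REFERENCES listings(id),
    status TEXT NOT NULL,
    changed_at TEXT NOT NULL,
    source_reason TEXT
);
""")

# The unique constraint makes re-imports safe: inserting the same
# (source_feed, source_listing_id) again fails instead of duplicating.
insert = ("INSERT INTO listings (source_feed, source_listing_id, street, "
          "postal_code, price) VALUES (?, ?, ?, ?, ?)")
conn.execute(insert, ("broker_csv", "482", "12 Main St", "02139", 500000))
try:
    conn.execute(insert, ("broker_csv", "482", "12 Main Street", "02139", 500000))
except sqlite3.IntegrityError:
    pass  # duplicate rejected by the database, not by application code
```

Putting the constraint in the database means even a buggy importer can't silently create a second copy of the same listing.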

Import planning that won’t break production

Before you import anything, decide what kind of import you’re building.

A one-time migration can be strict and slow, because you run it once and fix issues as you go. A daily sync needs to be boring and predictable, because it runs forever and small mistakes pile up fast.

The easiest way to keep the site responsive is to separate importing from the public website. Treat imports like background work. Users should be able to browse and search while new rows are processed behind the scenes. If an AI-built app tries to import thousands of rows during a page request, you get timeouts, half-written data, and confused users.

Track every import run like a small report. When something looks wrong, you want answers quickly.

At minimum, record:

  • start and finish time
  • how many listings were created, updated, skipped, and failed
  • a short reason for each failure (missing address, invalid price, bad photo URL)
  • which file or feed version was used
  • who triggered the import (manual, teammate, scheduled job)

Also plan for bad files. You don’t need a perfect “undo everything” system on day one, but you do need a safe fallback. A practical option is “undo last import”: tag all records touched by a run so you can revert those changes or disable the affected listings if the file was wrong.
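The run report above can start as a simple record you persist per run. A minimal sketch; the counter and field names are illustrative:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ImportRun:
    """One record per import run: who ran what, and what happened."""
    source: str                    # file name or feed version
    triggered_by: str              # "manual", "scheduled", or a teammate
    started_at: str = ""
    finished_at: str = ""
    created: int = 0
    updated: int = 0
    skipped: int = 0
    failed: int = 0
    failures: list = field(default_factory=list)  # (row, short reason)

    def start(self):
        self.started_at = datetime.now(timezone.utc).isoformat()

    def record_failure(self, row_ref, reason):
        self.failed += 1
        self.failures.append((row_ref, reason))

    def finish(self):
        self.finished_at = datetime.now(timezone.utc).isoformat()

run = ImportRun(source="broker_2025-10-28.csv", triggered_by="scheduled")
run.start()
run.created += 1
run.record_failure("row 17", "missing address")
run.finish()
```

Tagging every touched listing with the run's ID is what makes a later "undo last import" possible.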

How to prevent duplicates before they spread

Duplicates rarely look like perfect copies. They start small, then multiply: one listing from an MLS feed, the same home from a broker CSV, then a manual edit that creates a third record. Imports can run fast, but you still need clear rules.

Start with a match key strategy. If a source gives you a stable listing ID, treat it as the identity for that source. Store both source name and source ID, and enforce uniqueness on that pair. When the source ID is missing or unreliable, fall back to your own matching rules.

Address formatting is the usual trap. “12 Main St Apt 4B” and “12 Main Street Unit 4B” are the same place, but string comparisons miss it. Normalize what you can (case, punctuation, common abbreviations) and store unit information separately so it doesn’t get lost inside the street line.

A fallback duplicate check can be simple and still effective: normalized street number and street name plus postal code, with unit type and unit number handled consistently ("Apt" and "Unit" count as the same). City and state can be a secondary check. If you have latitude and longitude, they can help too, but don't rely on them as your only key.
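Normalization doesn't need a geocoding service to be useful. A minimal sketch; the abbreviation map is illustrative and real feeds need a much longer one:

```python
import re

# Illustrative abbreviation map; extend it as your feeds demand
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "west": "w",
    "apartment": "unit",
    "apt": "unit",
}

def normalize_street(raw: str) -> str:
    """Lowercase, strip punctuation, and collapse common abbreviations
    so '12 W Main St' and '12 West Main Street' compare equal."""
    tokens = re.sub(r"[^\w\s]", "", raw.lower()).split()
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)
```

Keeping the unit in its own field, normalized the same way, stops "Apt 4B" and "Unit 4B" from defeating the comparison.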

Next, decide what happens on conflict. When the same home appears again, do you overwrite fields, merge them, or stop and ask for review? Many teams overwrite “fresh” fields like price and status, but preserve human-edited text like descriptions. The key is to choose rules you can explain and repeat.
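Those conflict rules fit in a few lines once you track which fields a human has touched. A minimal sketch, assuming listings are plain dicts and `human_edited` is a set of field names:

```python
# Fields where a human edit beats the feed; everything else is
# overwritten by fresh feed data (price, status, and so on).
PRESERVE_IF_EDITED = {"description"}

def merge_listing(existing: dict, incoming: dict, human_edited: set) -> dict:
    """Overwrite machine-fed fields with the incoming record, but keep
    human-edited text in the preserved set. Returns the merged dict."""
    merged = dict(existing)
    for name, value in incoming.items():
        if name in PRESERVE_IF_EDITED and name in human_edited:
            continue  # keep the human version
        merged[name] = value
    return merged
```

The point is not this exact policy but that the policy is one function you can read, test, and explain when an agent asks why a field changed.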

Finally, create a “possible duplicates” list for humans. Don’t block the entire import. Flag likely matches and let someone confirm, merge, or ignore them before duplicates spread into search results and saved favorites.

Photos: the fastest way to slow a site (and how to avoid it)

Photos make listings feel real, but they’re also the easiest way to turn a fast property search into a slow one. This often happens when the importer grabs original MLS images (very large) and the UI downloads them all at once.

Start by setting hard limits on images at the door. On upload and on import, cap maximum pixel size and file size. You can keep originals in storage if you need them later, but your app shouldn’t serve them to most visitors.

The biggest win is to generate multiple sizes and serve the smallest one that fits the layout. One huge image everywhere forces mobile users to download desktop-sized files.

A small set covers most apps:

  • Thumbnail for search results and small grids
  • Card for listing cards
  • Full for the gallery on the listing page

If you need zoom, add a high-res version, but only load it when the user asks for it.
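Serving the smallest variant that fits the layout can be one small function. A minimal sketch; the variant names and pixel widths are illustrative:

```python
# Pre-generated variant widths in pixels; illustrative values
VARIANTS = {"thumbnail": 320, "card": 640, "full": 1600}

def pick_variant(display_width_px: int) -> str:
    """Return the smallest variant at least as wide as the slot,
    falling back to the largest when nothing is big enough."""
    fitting = [(w, name) for name, w in VARIANTS.items() if w >= display_width_px]
    if fitting:
        return min(fitting)[1]
    return max((w, name) for name, w in VARIANTS.items())[1]
```

Because the variants are generated once at import time, this lookup is all the page has to do, and mobile users never download a desktop-sized file.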

Also load images only when they’re needed. On long results pages, lazy loading keeps the first screen fast even when there are hundreds of listings.

Assume your data will have gaps. Imports often include broken image URLs, missing photos, or timeouts from the source. Pages should still render quickly with placeholders and sensible timeouts. Avoid retrying failed images in a tight loop.

A simple example shows how quickly things can go wrong: import 5,000 listings with 20 photos each. If each photo is 4 to 8 MB and the results page shows 30 listings, you can force hundreds of megabytes onto one scroll. With capped sizes, pre-generated variants, and lazy loading, the same page stays responsive.

Keep search and browsing fast

Search often feels fine in a prototype, then slows down once real data lands. The fix is usually straightforward: make results pages cheap to generate, and only do heavier work when the user asks for it.

On results pages, query only what you show. If the card displays address, price, beds, baths, one thumbnail, and “days on market,” don’t also fetch long descriptions, full photo sets, agent notes, or similar homes. Those belong on the detail page.

Pagination and defaults matter more than most teams expect. A fast first page keeps people browsing and reduces load on your database. Keep page size sensible (often 20 to 40). Use a default sort users understand (newest, recently updated). If you offer radius searches, cap extremes so one request can’t become “search the entire state.”

Caching is the next lever. Many visitors run the same searches: “2 bed under $600k in Austin” or “condos downtown.” Caching those responses for even 30 to 120 seconds can cut repeated work during busy bursts.
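A short-lived cache for hot searches can be very small. This is an in-process sketch for illustration; production apps usually reach for Redis or a CDN, but the shape is the same:

```python
import time

class SearchCache:
    """Tiny in-process TTL cache for repeated search queries."""

    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._entries = {}  # key -> (expires_at, results)

    def get(self, key):
        entry = self._entries.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]
        return None  # missing or expired

    def put(self, key, results):
        self._entries[key] = (time.monotonic() + self.ttl, results)

cache = SearchCache(ttl_seconds=60)
key = ("austin", "2bed", "under_600k")
if (results := cache.get(key)) is None:
    results = ["listing-1", "listing-2"]  # stand-in for the real DB query
    cache.put(key, results)
```

Even a 30 to 120 second TTL means a burst of identical searches hits the database once instead of hundreds of times.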

Map views are useful, but they can wreck first load if they block rendering. Treat maps as optional: load the list first, then render the map when the user opens the map tab or toggles it on.

Step by step: a safe first import

The first import is where listings apps usually go off the rails. Treat the first run like a controlled test, not a one-shot “load everything” event.

Start small and make sure your rules work before you scale.

  1. Import a real sample of 50 to 200 listings. You want messy addresses, missing fields, and odd photo sets.
  2. Confirm your identity rules. Import the same sample twice and verify you update records instead of duplicating them.
  3. Process photos up front. Generate a few fixed sizes (thumbnail, card, full) and store metadata like width, height, file size, and a primary photo flag.
  4. Run the full import in the background. Track progress and failures, and keep a clear “last successful import” timestamp.
  5. Verify counts, then spot-check. Compare totals (new, updated, skipped, failed). Open a handful of random listings and check key fields, map pins, and photo galleries.

Before you call it done, run a few quick checks:

  • re-import the same file and confirm the listing count doesn’t grow
  • pick one listing and confirm edits get overwritten only when they should
  • load a listing page on mobile data and confirm photos stay snappy

Common mistakes that create slow pages and bad data

The fastest way to break trust in a listings product is to ship pages that load slowly and data that changes for no clear reason. These issues show up often when a prototype meets real users.

A few mistakes cause most of the pain:

  • importing straight into production without a sample run, so one bad column poisons thousands of rows
  • letting the schema form before you decide the unique key for a listing, which makes duplicates inevitable
  • keeping only original huge photos and resizing them on every page view, which burns CPU and slows pages
  • skipping import logging and change history, so you can’t answer “why did this listing change price?”
  • mixing draft and published state in the same field (or inferring it from missing data), which leaks unfinished listings into search

One realistic example: you import a broker CSV twice, but the second file uses slightly different address formatting (“St” vs “Street”). Without a true unique key, you end up with two copies of the same home, two photo sets, and confused buyers.

A few habits prevent big cleanups later. Run every new feed in staging first and compare counts. Pre-generate image sizes once at upload time. Write an import log entry for each run with counts and errors. Keep status fields explicit (draft, published, archived) instead of overloading one column.

Quick checks before you scale up

Before you load 10,000 listings, run a few checks that tell you whether the system will stay clean and fast.

Start with idempotency. A safe import is idempotent: run the same file twice and the result is the same. Import once, note totals, then import again and confirm you didn’t create extra listings, photos, or agents.
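The re-import test boils down to an upsert by your unique key. A minimal sketch using a plain dict as the stand-in database; field names are illustrative:

```python
def upsert(db: dict, listing: dict) -> str:
    """Upsert by (source_feed, source_listing_id).
    Returns 'created' or 'updated' so the import run can count both."""
    key = (listing["source_feed"], listing["source_listing_id"])
    action = "updated" if key in db else "created"
    db[key] = listing
    return action

db = {}
rows = [{"source_feed": "broker_csv", "source_listing_id": "482", "price": 500000}]
first = [upsert(db, dict(r)) for r in rows]
second = [upsert(db, dict(r)) for r in rows]  # re-import the same file
# Idempotency check: the row count must not grow on the second pass
assert len(db) == 1
```

Run this same check against your real database before scaling up: import, note the totals, import again, and fail loudly if anything was created the second time.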

Next, confirm you can explain what happened during the last run. If a broker calls and says “Listing 482 changed price yesterday,” you should be able to answer without guessing.

A short checklist is enough:

  • re-import test passes (no extra rows, no duplicated photos)
  • import log shows created, updated, skipped, failed, and why
  • listing pages stay fast on a mid-range phone over cellular
  • missing photos show placeholders without blocking page render
  • search and filters stay responsive when you simulate 10,000+ listings

One concrete test: pick a listing, change its price in the source file, re-import, then verify only that field changed. If your app creates a new row instead, duplicates will spread quickly.

Example scenario: a broker CSV import that nearly derails launch

A solo founder builds a real estate listings app with AI tools. A local broker sends a CSV with 5,000 listings plus a folder of photos. The founder runs an import, sees listings appear on the site, and assumes they’re done.

Two days later, complaints start. Some homes show up twice. Others have mismatched photos. The homepage feels slow, especially on phones.

Problem one is duplicates caused by small address differences. One row says “12 W Main St,” another says “12 West Main Street.” Some include unit numbers, some don’t. If the app treats the full address string as identity, it creates a new record every time formatting changes.

Problem two is photos. The broker photos are huge (often 4000 to 6000px wide). If those originals are served directly, each listing page becomes heavy and scrolling turns into stutter.

The fix is a simple plan before the next import. Use a stable match key first (broker listing ID, MLS ID, or a provided internal ID). If you don’t have one, create a fingerprint from normalized fields (cleaned street, city, ZIP, plus one or two stabilizers like beds/baths and list price) and send “almost matches” to a review queue instead of auto-creating.

For photos, store the original once, generate resized versions on upload (a thumbnail and a 1600px max image covers most needs), and serve the smaller versions by default.

After the fix, the founder tracks a few numbers on every import run: typical listing page load time, import duration for 5,000 rows, how many duplicates were merged or queued, and the total photo weight per page.

Next steps to ship without repainting the house later

Before you add more sources and more photos, get your rules on paper. A listings app can look done while the data layer is still fragile. One clear document saves weeks of rework.

Write down your unique key and conflict rules in one place. For example: “A listing is the same property when (source + source_listing_id) matches, and it’s probably the same when (address + unit + postal code) matches.” Decide what wins when fields disagree (price, status, bedrooms, open house times) and how you handle removals (inactive vs deleted).

Then schedule a test run that includes both import and images. Use a small but realistic batch, like 200 to 500 listings with full photo sets, so you can see timeouts, duplicates, and oversized images before they hit real users.

Keep the checklist simple:

  • run one import twice and confirm the second run creates zero new listings
  • log every conflict decision (what changed and why)
  • process photos end to end and confirm page speed stays acceptable
  • check search results for obvious duplicates and missing updates
  • verify secrets and credentials aren’t exposed during the job

If you inherited an AI-generated codebase that’s already slow or messy, a focused audit of imports, duplicate rules, and image handling usually surfaces the real issue quickly. Teams like FixMyMess (fixmymess.ai) specialize in diagnosing and repairing AI-generated prototypes so they’re safe to run in production, especially around broken import logic, tangled schemas, and performance problems that only show up at real scale.

FAQ

What’s a “good” speed target for a listings app?

Aim for this baseline: search results should feel near-instant, and listing pages should render meaningful content in under a couple of seconds on mobile data. If you’re seeing blank photo boxes, filters resetting, or multi-second delays after imports, fix imports and images first because they usually cause the biggest slowdowns.

What should I use as the unique ID so imports don’t create duplicates?

A safe default is source_feed + source_listing_id as the unique key, enforced in your database. If a feed doesn’t provide a stable ID, use a normalized address plus unit, then add one or two stabilizers like postal code and list price so minor formatting changes don’t create new records.

How do I make sure re-importing the same file doesn’t add more listings?

Make your import idempotent: importing the same file twice should not change counts. The simplest approach is “upsert” by your unique key, and for photos also enforce uniqueness using a source photo ID or a hash of the source photo URL so the importer can update rather than append.

Should imports run inside the web app request or in the background?

Do imports as background work, not during a page request. If you import inside a web request, users will hit timeouts, you’ll get partially written data, and the app will feel unreliable. Keep browsing and search separate from importing so the site stays usable while data syncs.

When the same listing arrives twice, should I overwrite, merge, or stop the import?

Use clear rules: overwrite “fresh” machine-fed fields like price, status, and open house times, and preserve human-edited fields like custom descriptions unless you explicitly choose otherwise. The key is consistency so you can explain later why a field changed.

What’s the simplest photo setup that keeps pages fast?

Serve resized variants by default and never serve the original huge image to most users. Generate a small set of sizes during import or upload, then load only what’s needed on the page; this usually fixes the “slow and stuttery” feel immediately.

Should I delete sold/removed listings or keep them?

Don’t hard delete; mark listings as off market or removed and record when it happened. Keeping the record prevents broken bookmarks and avoids confusing analytics, and it also helps when feeds temporarily drop and later restore a listing.

Which fields should be required to avoid broken listing pages?

Require only the fields you truly need to render a safe page and to match records on re-import, such as source info, status, address basics, postal code, and price/rent. Treat optional fields like beds, baths, square footage, and description as “nice to have” because real imports often have gaps.

How do I keep search and filters fast as listings grow?

Fetch only what you display on the results card, and keep heavy data like full photo sets and long descriptions for the detail page. Add sensible pagination, avoid extreme radius searches, and consider short-lived caching for popular queries so repeated searches don’t hammer your database.

When should I bring in help to fix an AI-built listings app?

If an AI-generated codebase is timing out on imports, creating duplicates you can’t explain, or serving oversized images, an audit usually finds the root cause quickly. FixMyMess specializes in diagnosing and repairing AI-generated prototypes so they’re production-ready, often within 48–72 hours, starting with a free code audit.