talk to the world's public data

turbocharge your LLM by connecting it to the world’s largest public database

01 / who it’s for

built for the analysts doing the work that matters.

academic researchers: skip the cleanup. start with the question.
think tanks: produce policy briefs grounded in joined evidence.
nonprofits: measure impact against open public baselines.
corporate analysts: macroeconomic context on demand, grounded in source data.
independent analysts: publish real work on real data. get spotlighted.

02 / faq

questions.

what is point luna?

point luna is an organization on a mission to build the largest relational database of public data in the world, and integrate it with ai agents to automate data analysis and democratize decision-making.

our first product is an mcp server that connects ai clients — claude, cursor, chatgpt, and others — to point luna and queries the warehouse on their behalf. it’s free to get started. sign up.

how does point luna work?

we collect public datasets, ingest them, clean them up, document them, and make them easy to query. everything is joined at the geographic level, so you can ask questions across data that historically lived in incompatible silos. because every table shares the same geographic keys, new datasets can be added without reworking what’s already there.

the default unit of analysis is the county, with state, metro, and zip available where the underlying data supports it. counties give us a useful balance: granular enough to surface real local variation, broad enough that most public data is reliably reported at that grain, and stable enough that you can build time-series across decades.
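
to make the county-level join concrete, here is a minimal python sketch. it is not point luna's actual implementation. it assumes each table carries a five-digit county fips code and a year; the column names, the `join_on_county` helper, and the sample values are all hypothetical.

```python
# two small tables keyed by county fips code and year.
# the fips codes are real county identifiers; every value is made up.
unemployment = [
    {"county_fips": "06037", "year": 2022, "unemployment_rate": 4.9},
    {"county_fips": "17031", "year": 2022, "unemployment_rate": 5.1},
    {"county_fips": "48201", "year": 2022, "unemployment_rate": 4.3},
]
median_rent = [
    {"county_fips": "06037", "year": 2022, "median_rent": 2000},
    {"county_fips": "17031", "year": 2022, "median_rent": 1500},
    {"county_fips": "36061", "year": 2022, "median_rent": 2500},
]


def join_on_county(left, right, keys=("county_fips", "year")):
    """inner-join two lists of row dicts on the shared geographic keys."""
    # index the right-hand table by its key tuple for constant-time lookups
    index = {tuple(row[k] for k in keys): row for row in right}
    joined = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            # merge the two rows; the key columns are identical on both sides
            joined.append({**row, **match})
    return joined


rows = join_on_county(unemployment, median_rent)
for row in rows:
    print(row["county_fips"], row["unemployment_rate"], row["median_rent"])
```

fips codes are kept as strings so leading zeros survive. only counties present in both tables come through the join, which is why coverage notes matter when combining sources.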

sources include the U.S. Census Bureau, BLS, BEA, CDC, IRS, HUD, FHFA, EPA, FBI, NHTSA, the Department of Education, MIT Election Lab, federal program records, and a long tail of state and local agencies. the full catalog and current ingestion status live in our data library.

what counts as a “public” dataset?

some concrete examples to give you an idea:

federal and state agency data released under open data policies
academic datasets released under permissive licenses
aggregated administrative records that have been de-identified and published

we do not include:

anything containing pii or anything that could be re-identified
proprietary or licensed data, even if it’s been leaked or scraped
data behind paywalls or restricted access agreements
anything we can’t fully document the provenance of

every dataset in point luna is documented end-to-end: where it came from, when we pulled it, what we changed during cleaning, and what its known limitations are.
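
as an illustration of what end-to-end documentation could look like in structured form, here is a small python sketch. the `ProvenanceRecord` class, its field names, the `undocumented_fields` helper, and the example values are assumptions made for this example, not point luna's actual manifest format.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from datetime import date


@dataclass(frozen=True)
class ProvenanceRecord:
    """hypothetical record covering the four things documented per dataset."""
    dataset: str
    source: str                          # where it came from
    retrieved_on: date                   # when it was pulled
    cleaning_steps: tuple[str, ...]      # what changed during cleaning
    known_limitations: tuple[str, ...]   # documented caveats


def undocumented_fields(record: ProvenanceRecord) -> list[str]:
    """return the names of fields that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# placeholder values only; this does not describe a real dataset
record = ProvenanceRecord(
    dataset="example_county_table",
    source="example agency open data portal",
    retrieved_on=date(2024, 1, 15),
    cleaning_steps=("normalized county_fips to five-digit strings",),
    known_limitations=(),
)

print(undocumented_fields(record))  # ['known_limitations']
```

a record like this makes gaps visible: here the limitations are still unwritten, so the dataset would not yet count as fully documented.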

what data is included today?

we’re currently focused on U.S. data at the county, metro, and state level, organized into the domains people actually make decisions on:

economy — employment, wages, industry mix, ai exposure, business dynamism
demographics — population, race/ethnicity, age, education, migration, language
health — mortality, chronic disease, healthcare access, healthcare cost, environmental health
housing — home values, rents, cost burden, mortgage activity, vacancy, supply
education — test scores, attainment, enrollment, school finance, post-college outcomes
elections — federal and state-level results, partisan trends, competitiveness
energy & climate — air quality, emissions, electricity costs, climate exposure (early)
immigration — foreign-born population, language, naturalization (early)

coverage isn’t uniform — some domains have a decade of clean time-series, others are still being built out. the honest answer for what’s ready today is in the dataset catalog, with reliability tiers and coverage notes on every table.

what's coming next?

hyperlocal data. 311 records, building permits, local crime data, and other municipal sources. most U.S. metros publish this, but it’s a mess of incompatible formats. we’re prioritizing the cities where there’s a clear use case — tell us if you have one.

international data. Canada, UK, EU, and India are next on the roadmap. cross-country comparison is one of the highest-leverage things you can do with structured public data, especially for policy questions.

more domains. crime, transportation, social safety net, mobility/opportunity, family/aging, and democracy indicators are all on the longer-term list.

if there’s a use case driving you to ask, tell us — we prioritize based on what people will actually use.

how do i add data to point luna?

tell us about the data you’d like to see added. at minimum, name the dataset and explain your use case. we’ll review the request and, if it’s accepted, give you a timeline.

every addition goes through our review process before it lands in the public schema. we check:

provenance — where the data comes from, who publishes it, and whether the license allows redistribution
refresh cadence and reliability — how often it updates, how stable the schema is, and how trustworthy the source has been historically
overlap — whether we already have a comparable or better source covering the same ground
integrity — automated and manual checks for cleanliness, completeness, and consistency before it’s exposed to users
use case — why is this data important and actionable?

datasets that pass review get a public manifest, a reliability tier, and a documentation page. datasets that fail get an honest writeup of why we’re not including them, so the next person asking the same question doesn’t have to redo the work.
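
the automated half of the integrity check could be sketched roughly as follows. this is a simplified python illustration under assumed column names (`county_fips`, `year`, `value`) and a hypothetical `check_integrity` helper; the real review covers more than these three rules.

```python
def check_integrity(rows, required=("county_fips", "year", "value")):
    """return a list of human-readable problems found in the rows."""
    problems = []
    seen = set()
    for i, row in enumerate(rows):
        # completeness: every required column must be present and non-empty
        for col in required:
            if row.get(col) in (None, ""):
                problems.append(f"row {i}: missing {col}")
        # cleanliness: county fips codes should be five-digit strings
        fips = row.get("county_fips")
        if isinstance(fips, str) and not (len(fips) == 5 and fips.isdigit()):
            problems.append(f"row {i}: malformed county_fips {fips!r}")
        # consistency: at most one row per county and year
        key = (fips, row.get("year"))
        if key in seen:
            problems.append(f"row {i}: duplicate key {key}")
        seen.add(key)
    return problems


# made-up sample rows, three of which are deliberately broken
sample = [
    {"county_fips": "06037", "year": 2022, "value": 1.0},
    {"county_fips": "6037", "year": 2022, "value": 2.0},    # lost its leading zero
    {"county_fips": "17031", "year": 2022, "value": None},  # missing value
    {"county_fips": "06037", "year": 2022, "value": 3.0},   # repeats row 0's key
]

problems = check_integrity(sample)
for problem in problems:
    print(problem)
```

returning a list of problems, rather than stopping at the first failure, gives a reviewer the full picture in one pass.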

what does it cost?

point luna is free to get started. we expect to introduce paid tiers, but we’re committed to keeping subscription costs low because we believe this is something the world needs.

how do i access the data?

point luna runs through your ai client over mcp (the Model Context Protocol). connect the mcp →

who's behind this?

point luna is built by a small team with big-tech backgrounds in data science and software engineering. if you want to contribute (code, data, analysis, editorial work, or just sharp questions), we want to hear from you.

how do i get started?

browse the data — check our dataset catalog →
connect — follow the onboarding guide →, which walks you through your first analysis in a few minutes
get in touch — for data requests, collaboration, or use cases we haven’t thought of yet, email us

ready to see what public data can actually do?