Looking at genealogy from the ground up.

If you’ve been following this blog for a while, you’ll know I’ve been researching Chinese Case files for years. See my posts Chinese Immigration Act Case Files: Finding aids at LAC (2020),¹ The startling details of a Chinese Case File (2023),² and How to Access Chinese Case Files at LAC (2025).³ Before I go further, I want to emphasize that I am not on this journey alone, and I benefit hugely from the volunteer efforts and expertise of others. If you’d like to see what we’re up to, come join us in my private Facebook group Genealogy for Asian Canadians, which we affectionately call GFAC.⁴

The problem with Chinese Case files

Genealogy is the gift that keeps on giving. Every single record answers a few questions and asks many more. In Chinese genealogy, where so many gaps exist, we squeeze every document for details: the marginalia, the weird references, the signatures of the controllers. We double check birth and arrival dates. We add the new spelling of our ancestor’s name to our lists. We do this because we usually have too few records to find.

With Chinese Case files, we have the opposite problem. There’s too much to process. A case file can run from a few dozen to a few hundred pages. Our group’s success in developing an acquisition methodology, plus the efforts of the archivists at Library and Archives Canada, have combined to form a pipeline to the richest vein of records in Chinese Canadian genealogy: Case Files.

And that leads me to the next problem. Now that we have them, how can we understand them?

Chinese Case files were too hard to analyze…

In my four-part blog series, I wrote the story of Quon Hing and his attempts to bring his sons to Canada.⁵ (His Case File is the entry image for this blog.) The series demonstrates the tools I used, from printing the file and reorganizing it into groups, to analyzing the types of records in a spreadsheet, to recording the cross-referenced file numbers. I built org charts to understand the chain of command. I researched every name and their titles: Chief Controller, Member of Parliament, Minister of Trade. I created a timeline of events. I mapped locations in China. Each blog took a weekend to research and write, totalling two weeks of intensive work. I did my best but missed important details. Seen individually, Case Files made no sense. I was applying my too scanty knowledge to specifics. There were too few records outside Case Files to understand the gaps, and too little research done on them. Where was the research on Chinese Case file methodology? It was uncharted territory, which is both great and terrible in equal measure.

I’ve been asked for help to understand Chinese Case files, but I couldn’t accept the work because I didn’t think a fee I could reasonably charge would cover the time it would take. I needed a better process.

… And then came AI.

Two years later, the idea hit me. See my post, Me and ChatGPT, a story.⁶ What if I could build an analytical framework to help me understand Chinese Case files? What if I could develop a process that took weeks of intensive work and made it an accessible, reliable, and repeatable workflow?

I said to ChatGPT 5.2 (Chat, from now on):⁷

[After feeding Chat two sample Case files] I want to better understand the process by which Chinese Case files were created by the Canadian government in its various iterations 1900-1999. Think of this as more contextual work. I am trying to understand a government process by the paper trails left behind and my understanding is subject to misunderstandings based on lack of data. I want to give you a dozen links to build a new analytical framework. Is this the right way to go or do you suggest a better way?

Chat said [edited for brevity]:

You can give me a dozen links, and I can work through all of them … and produce a new, process-oriented analytical framework. But there is a … more accurate approach… Treat This as a “Process Reconstruction” Project… Chinese Case Files were not produced by a single workflow— they emerged from multiple statutes, multiple administrative regimes, multiple rewriting of forms, policy memos, Ministerial Orders, and departmental reorganizations (Interior → Immigration → Labour → Citizenship and Immigration → Employment and Immigration, etc).

Why was my question the wrong question to ask?

Here’s the difference between what I envisioned and what Chat recommended.

I was thinking: give Chat information from the folks that have studied the records, plus a few blogs I’d written. It would have been original work.

The flaw was that I was proposing to study fragmented collections long after their creation. Chat was proposing we reconstruct a foundational legal/analytical framework from the laws themselves. It was the genealogist in me – studying history from present to past. Chat suggested a much more ambitious project: start at the beginning. It was breathtaking. Audacious. Too much maybe. But I’m a methodologist as much as a genealogist. I think about life in terms of methods, process, and how did this happen?

What if this was possible? I was willing to try. Test it out. See if it worked.

Building an analytical framework

And that is how I’ve spent almost two full months, outside unavoidable obligations. Initially I looked for legal instruments: statutes, laws, and amendments. Then I looked for documents that showed how the laws were operationalized: regulations, memos, and internal correspondence. I needed copies of the Orders in Council (OICs), where the Privy Council made a constant stream of immigration-related decisions.⁸ I looked for the key legal cases, such as Tai Sing v. Maguire (1878), Union Colliery Co. of British Columbia v. Bryden (1899), Cunningham v. Tomey Homma (Judicial Committee of the Privy Council, 1902), and Quong Wing v. The King (1914).

Eventually I realized that while I had focused on the Chinese Immigration Act, I was also going to need to look at the Immigration Act and the British North America Act (BNA).⁹ I gathered Chinese Immigration records from C.I.#1 to C.I.#50. I located ledgers, lists, and manifests. There was no hope of doing all this in a systematic order, and that’s how I went from framework versions one to fourteen, because after I found the BNA, everything afterwards was reframed in context. I was also learning how to work with Chat, which is akin to building the plane as you’re flying it.

It went like this. Chat suggested the framework – the organization of how we were going to analyze and place the results. I looked for the laws. Chat revised its framework based on the results. I found more laws. Rinse and repeat. It was a process very familiar to genealogists – each record inspires a dozen questions and rabbit holes – and I struggled to stay on track. I was fascinated by the process of feeding a document into Chat and having it summarize the key points.

Over the weeks that followed, as the framework developed, I saw what I’d been missing appear like a photo in the developer bath: granular historic context. It was magical and I was hooked.

Here are insights I gained from looking at the history of Canada through its anti-Chinese bureaucracy. Before 1923, Chinese immigration operated as an imperfect system, not a series of laws. There were too many what ifs and specific cases for the laws to handle, so the Controllers and their officers had wide discretionary powers to make small, discrete decisions. Individual cases were treated as exceptions, and exceptional cases created paperwork. The surviving Chinese Case files create a research bias that wasn’t intended in the original system. Put another way, Case Files may be seen as a collection of exceptions. They are what was created when Chinese persons tried to navigate a system designed to prevent them from doing what they wanted to do.

Another insight that emerged is how race, or the fact of being Chinese, became a bureaucratic checkpoint. Under exclusion, human rights were superseded by the colour of your skin. Quon Hing was unable to secure his son’s 1938 entry to Canada even though Quon was naturalized and the boy was at great risk of dying in the war in China. The Controller privately said he could make an exception but there were many cases and to admit one would be to admit them all. The Case Files are filled with examples of people who used all the resources at their disposal – lawyers, powerful friends, and the money needed to go the legal route – and failed. Race was the reason for the immovable bureaucracy.

Some people have said that Case Files don’t tell the whole truth. That many people lied. At some point I might be in a position to opine on what was truth and what was a lie, but for the time being I’m focused on what was systemically intended and what archivally survived.

What I learned about ChatGPT

I learned a lot about Chat – oh so much – along the way.

You can create private environments within Chat for analysis. I call them ringfences.
Chat doesn’t build things for you unless you ask. I was ten days into building the framework when Chat said, Upload the current framework, and I said, What framework? Until that point, I didn’t realize I was supposed to be cutting and pasting the results into a separate doc. Oops.
Chat doesn’t remember things perfectly. It’s not a human. While I was learning in leaps and bounds, Chat was forgetting details from conversations held a few days earlier. I needed to start new sessions with a review, preferably by uploading the latest copy of the framework.
For this reason – call it data rot – I recommend storing important items outside the Chat environment. I’ve been storing my prompts in Zotero, and important conversations in Evernote.
Chat reads handwriting badly and has great difficulty with non-OCR’d documents. If you don’t verify the results, you’re at risk of creative (read: imaginary) results.
Chat can verify if you ask it to verify. Silly me, not thinking to ask.
Chat can create citations if you provide all the citation info. I only started doing this midway through, which means my citations are somewhat sketchy. (You can create a prompt workflow for citations but I’m getting ahead of myself.)

What I learned about the network of anti-Chinese laws

I learned even more about the Canadian system, or put another way, how Chinese immigration was an administrative nightmare where the people paid to run it themselves struggled to understand the labyrinth of laws, regulations, OICs, and last-minute decisions.

The early system was incoherent and exception-based, e.g., bonding for students.
Chinese were subject to at least two major immigration systems and could fail either or both: the Immigration Acts and the Chinese Immigration Acts.
The survival of some records such as CI9s has caused them to be treated with more authority than the system intended.

What We Tried That Failed

I asked Chat to list our early failures. Each of these is worthy of a blog post alone. Here’s what it said:

Early outlines organized strictly by legislation rather than administration
Attempting a single unified chronology too early
Trying to normalize terminology across time periods
Assuming consistent definitions of “merchant,” “student,” “wife,” or “dependent”
Treating CI forms as stable instruments rather than evolving tools
Expecting post-hoc explanations in files to reflect original reasoning
Attempting to fold post-1923 cases into the same analytical structure
Assuming denial decisions would always cite legal authority clearly
Believing that edge cases could be handled as footnotes
Underestimating how much discretionary refusal language mattered
Trying to compress the framework too early for usability

Closing Thoughts

I’ll close with my thoughts on what I’ve achieved. Using artificial intelligence, I have created a work I never would have been able to do on my own. I have struggled to understand laws separately, such as the Chinese Immigration Act, 1885, the Revised Statutes of Canada, 1906, and the Chinese Immigration Act, 1923. My analytical framework is a synthesis of dozens of laws, from the British North America Act, 1867 to the repeal of Chinese exclusion in 1947. It’s not perfect, as a historically accurate work never can be, but it’s a foundation. It’s a process, a developing method, and a tool. Soon I might be able to use it to analyze cases.

Image created with ChatGPT by the author, 31 Jan 2026.

For fun, I asked Chat to create an image of the work we did. If it passed the smell test, I’d use it as the entry image. The first two attempts included a map of the U.S., and the Statue of Liberty. In this, the third try, I liked the image of the Case File and then realized I had actual, real images of Case Files I could use instead. The headline image is from my work on Quon Hing, from 2023. Wow, what a lot we’ve all learned since then.

What’s next?

At some future point, I’ll tackle the post-Exclusion era in a separate analytical framework. If anything, that era is even more difficult to understand than the pre-Exclusion one, where at least the laws were overt. In my blog post, Order-in-Council PC 2115: When immigration met the X-ray machine (2021), I talked about immigration after 1947, when Canada used skeletal size as a reason to keep Chinese out.¹⁰ I’d like to bring the kind of AI-supported research techniques I’ve used with Chat for 1867-1923 to the post-Exclusion period: the X-ray era of enforcement and prohibition.

Now I’m developing usable prompts to understand case files. In this blog, you’ll see my progress so far:

ideate a process for understanding case files
expand the scope to look at the environment that created case files
iterate sixteen versions of the framework until it’s workable
create prompts that analyze case files

What’s a prompt? Why did it take a full working day to write them? I’ll tell all in my next post.

And after that? After the building, prompt developing, and testing? Oh, I have such plans. I am envisioning templates, white papers, journal articles, and services to aid my fellow genealogists. This is only the beginning.

Thank yous

Thanks go to the folks in the Facebook group Genealogy and Artificial Intelligence.¹¹ The conversations are as educational as they are entertaining. A special shout-out to Mark Thompson, whose comment that I was undertaking a big, hairy, audacious goal inspired today’s blog post title.

Thanks equally to the archivists at Library and Archives Canada – you know who you are – who have worked diligently to complete Chinese Case file accessions, update finding aids, create new finding aids, and verify the contents of those long-ignored boxes. On a related note, thank you to LAC for the opportunity to meet regularly, collaborate and share, and get to know one another. It has been a pleasure working with you.

Finally, thank you to the volunteers in GFAC, who not only built the C.I.9 Lookup Tool (2023-26) but also a bigger, better lookup tool called the Query All Tool.¹² This second tool, now under development, has already changed my workflow in that it allows lookups by any name variant, any number, and returns all related records. We are planning to release the Query All Tool to select users for testing. Stay tuned for a coming blog post.

Acknowledgement

I worked with Chat intensively in developing the analytical framework, and was tempted to ask it to write this post. I realized I work best by writing and reflecting, so I scaled back my request and asked it to suggest an outline instead. The outline was a good guide to keep me on track, but it was too mechanical in tone. I kept the section about failures in its entirety. There are differing opinions about whether or not to acknowledge the use of Chat in your work, and I believe it is intellectually honest to disclose the particulars.

References

1. Linda Yip (29 Aug 2020, updated 11 Mar 2025), Chinese Immigration Act Case Files: Finding aids at LAC
2. Linda Yip (10 Sep 2023), The startling details of a Chinese Case File, pt. 4 – How to get your ancestor’s file
3. Linda Yip (31 Aug 2025), How to Access Chinese Case Files at LAC (2025)
4. Linda Yip, Genealogy for Asian Canadians: https://www.facebook.com/groups/1923544271050858/. Be sure to answer the two intake questions so I know you’re not a bot.
5. Linda Yip (7 Aug 2023), The startling details of a Chinese Case File – the story of Quon Hing, aka George Sing, pt. 1
6. Linda Yip (21 Dec 2025), Me and ChatGPT, a story
7. I am using ChatGPT Plus, currently version 5.2. It’s a paid version.
8. See my three-part blog series on Orders in Council, beginning with How to navigate Order-in-Council records, part 1: real life at LAC. See my collection of OICs at Orders-in-Council immigration regulations (1930-1960). Also see How to navigate Order-in-Council records part 3: online at Ancestry
9. I have excluded researching voting (Dominion Elections Act, federally and provincially) or the Citizenship Act (1947).
10. Linda Yip (20 Jun 2021), Order-in-Council PC 2115: When immigration met the X-ray machine
11. Facebook, Genealogy and Artificial Intelligence: https://www.facebook.com/groups/genealogyandai/.
12. See the guest post by Robert Louie (16 Dec 2023), Why do we care about the C.I.9s? – a guest post by Robert Louie

4 thoughts on “My Big Hairy Audacious Goal – creating an analytical framework with ChatGPT”

Mark Thompson says:

February 1, 2026 at 12:34 pm

Linda, this is truly amazing work! The “big hairy audacious goal” label is even more fitting now that I’ve read what you actually built.

What strikes me most is how you’ve demonstrated that using AI isn’t about skipping over the hard analytical work—it’s about *expanding* what’s possible. Two months of intensive research and iteration, constantly verifying and refining. That’s rigour. That’s methodology. You’re a process geek after my own heart! The AI didn’t do this for you; you built something with it that you couldn’t have built alone, and those are two very different things.

I can also appreciate the weight of what you’re doing here. You’re not just creating a research tool. You’re trying to bring clarity to a system that was designed to be opaque, to exclude, and to deny. The fact that these Case Files might help Chinese Canadians make sense of documents about their own families, documents created by laws meant to keep them out… that matters in ways I can appreciate even if I can’t feel them the way you and the Chinese Canadian community do.

Thank you for sharing this, including the missteps along the way.

Mark.

1. wanderernolonger says:
  
  February 1, 2026 at 4:10 pm
  
  Huge thanks for your input, Mark. There were at least five times when I thought, “oh I should have been doing that from the beginning,” but that’s the iterative process. I didn’t know where we’d stumble until we stumbled. Twice I thought we’d have to start all over again but iterative versions saved us. I learned only after processing a dozen or so laws that asking Chat to develop a prompt for the work was a better workflow than me thinking up questions. Oh so much learning. But don’t get me wrong – it was fascinating and I learned a ton. Now to see if this baby can fly. Lots more testing in my immediate future.
  

Discover more from Past Presence