In the College of Engineering, students and faculty are using data science to develop groundbreaking tools and tackle real-world challenges. Through the ongoing series, “Data for Good: Innovators in Action,” we highlight how these efforts are driving meaningful change across various societal issues. Join us in exploring these stories and discover how you can be a part of the movement to create a brighter, more equitable world.
History lecturer Cameron Jones wasn’t expecting to uncover a forgotten chapter of California’s past while studying missionaries in the Amazon. But as he pored over mission records, a striking pattern emerged — many of the people mentioned were of African descent.
Intrigued by his discovery, Jones shifted his focus back to California, where he visited the Santa Barbara Mission archives. As he combed through centuries-old documents, a staff member approached him and said, “We have been waiting years for someone to do this project.”
That moment sparked a groundbreaking effort to recover the stories of California’s African-descended population between 1769 and 1850, a narrative that had largely faded from view. Today, an interdisciplinary team at Cal Poly is using advanced data science to bring those stories back into focus.
“We’re reconstructing a past that was nearly erased,” Jones said. “It’s about more than just uncovering names — it’s about understanding how these communities formed, thrived and contributed to California’s history.”
The yearslong project has produced resources like detailed family trees, digitized census records and interactive visualizations. These tools map the lives of African-descended Californians, or African Californios, allowing researchers and the public to explore their contributions — from land ownership to social structures — with ongoing efforts to reveal even more about their legacy.
By blending archival research with innovative data science, the team is deepening our understanding of the past and transforming how we engage with California’s history.
As it turned out, computer science Professor Foaad Khosmood had a keen interest in California’s beginnings. When he learned about Jones’ research, he realized data science could help unearth relationships within the underrepresented population of African Californios.
“Very few people have studied the African Californios, and no one has approached it with computational tools,” Khosmood said. “I knew we could use visualizations and digital tools to bring this story to a wider audience.”
To understand the significance of this project, it’s important to look at the early demographics of California. In 1790, nearly one in five nonnative Californians were of African descent, with the largest communities in Los Angeles and San Jose.
As Spain sought to strengthen its hold on California, it expanded its military presence in the region by enlisting people of African descent and individuals of mixed heritage, recruiting beyond native Spaniards to protect its territories. In 1814, records from a mission in San Luis Obispo noted that five of the six soldiers stationed there were of African descent.
African-descended individuals played an essential role in California’s development. The final Spanish census in 1821, conducted just before the transition to Mexican control, confirmed their continued presence in the region.
One prominent figure during this time was Pio Pico, the last governor of Mexican California, whose African ancestry has made him a key part of the project’s research.
“Not many realize there were that many Black people in early California,” Jones said.
To bring the African Californios’ history to life, Jones and Khosmood set out to develop a system that matches individuals of African descent across historical documents, which would then allow them to construct detailed family trees. The task sounds straightforward, but incomplete records and discrepancies between sources made it difficult to connect the dots across layers of historical data.
Jones and Khosmood relied on the Early California Population Project — a digital database containing baptism, marriage and burial records from California’s missions. However, the records lacked one crucial detail: race. To fill this gap, they turned to census records, which included some racial information. Merging these sources — matching names and data across censuses and mission records — presented its own complexities.
Census records were in physical form, so over several months, Jones and his students worked to scan the data into digital files. The team then encountered additional complications with different spellings, accented letters and name variations.
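One common way to reduce problems with accented letters and inconsistent capitalization is to normalize every name before comparing it. The sketch below is illustrative only and is not the team's code; the function name normalize_name is hypothetical, and folding "ñ" to "n" is an assumption that may or may not suit a given dataset.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Return a comparison-friendly form of a name.

    Assumed normalization steps (not taken from the project):
    strip accents, ignore case, and collapse repeated whitespace.
    """
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, keeping only the base letters.
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # casefold() is a more aggressive lower(); split/join collapses whitespace.
    return " ".join(without_accents.casefold().split())
```

With this in place, two spellings that differ only in accents or spacing compare as equal, which leaves the harder spelling variations for a fuzzy comparison step.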
To tackle these differences, the team modified an algorithm used to compare text strings. Spanish names, with interchangeable letters such as “S” and “Z,” required further customization of the algorithm, so the team developed a list of letter substitutions specific to colonial Spanish, which improved its accuracy.
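The article does not name the string-comparison algorithm, so the following is a minimal sketch that assumes a Levenshtein-style edit distance in which certain letter swaps cost less than others. Only the S/Z pair comes from the article; the B/V pair, the cost of 0.25 and all function names are assumptions made for illustration.

```python
# Letter pairs treated as near-equivalent. Only "s"/"z" is mentioned in the
# article; "b"/"v" is a hypothetical addition for illustration.
CHEAP_SUBSTITUTIONS = {frozenset("sz"), frozenset("bv")}


def substitution_cost(a: str, b: str, cheap_cost: float = 0.25) -> float:
    """Cost of replacing letter a with letter b."""
    if a == b:
        return 0.0
    if frozenset((a, b)) in CHEAP_SUBSTITUTIONS:
        return cheap_cost  # assumed discount for interchangeable letters
    return 1.0


def weighted_edit_distance(s: str, t: str) -> float:
    """Edit distance with discounted substitutions (insert/delete cost 1)."""
    s, t = s.lower(), t.lower()
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        curr = [float(i)]
        for j, b in enumerate(t, start=1):
            curr.append(
                min(
                    prev[j] + 1.0,                          # delete a
                    curr[j - 1] + 1.0,                      # insert b
                    prev[j - 1] + substitution_cost(a, b),  # substitute
                )
            )
        prev = curr
    return prev[-1]
```

Under these assumed costs, "Gonzales" and "Gonsalez" score 0.5 instead of the 2.0 a plain edit distance would give, so spelling variants of the same name rank as closer matches than unrelated names.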
“Data allows us to piece together details that might otherwise remain fragmented, giving us a more complete and nuanced understanding of the past,” Khosmood said.
A driving force behind the project’s technical advances is Anthony Colin Herrera, a computer science master’s student from Bakersfield, California, who brought both his expertise and a deep connection to his heritage.
“When I learned about the African Californios, I was struck by how little-known their story is,” Colin Herrera said. “My background made this project especially meaningful, and working with real-world data felt like the perfect way to honor that history.”
Colin Herrera, fluent in Spanish, identified “family units” based on shared last names, parent-child relationships and spousal connections. From these units, he traced generations and built family trees linking parents and children across multiple datasets.
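To illustrate the idea, the sketch below groups children by their recorded parents to form family units and then walks parent-child links to build a nested tree. It is a simplified assumption of how such a step might look, not the project's implementation; the record fields ("id", "father", "mother") and both function names are hypothetical, and it assumes the records contain no circular parent links.

```python
from collections import defaultdict


def family_units(people):
    """Group child ids by their (father, mother) pair.

    Each person is a dict with an "id" and optional "father" and
    "mother" ids (a hypothetical schema, not the project's).
    """
    units = defaultdict(list)
    for person in people:
        father = person.get("father")
        mother = person.get("mother")
        if father is not None or mother is not None:
            units[(father, mother)].append(person["id"])
    return dict(units)


def descendants(person_id, people):
    """Return a nested dict of descendants: {child_id: {grandchild_id: ...}}."""
    children = defaultdict(list)
    for person in people:
        for role in ("father", "mother"):
            parent_id = person.get(role)
            if parent_id is not None:
                children[parent_id].append(person["id"])

    def walk(pid):
        # Recurse through each child; assumes no cycles in the records.
        return {child: walk(child) for child in children[pid]}

    return walk(person_id)
```

Keying units on the parent pair keeps siblings together even when surnames are spelled differently across records, while the recursive walk shows how generations can be linked once parents are identified.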
Their research is available to the public through AfricanCalifornios.org, where users can delve into the project’s findings, browse family trees and view visualizations that illustrate the impact of African-descended individuals in early California.
A major milestone came this summer when a group from Cal Poly, including Colin Herrera, presented the project at DH 2024, the annual conference of the Alliance of Digital Humanities Organizations. The presentation highlighted their progress in using data science to reconstruct family histories, sparking interest and questions from scholars eager to explore the project’s future potential.
“I was nervous to present in front of experts in the field of digital humanities,” Colin Herrera admitted. “But knowing I was representing Cal Poly, the Computer Science Department, and our research team, I knew I had to give it my best.” It was his first conference and the first time he had been included as an author on a published paper.
Firsts are nothing new for Colin Herrera. As the first in his extended family to attend college — and soon to earn his master’s degree — he’s forging a path for himself and future generations. “I wanted to be the first to go to college, and now, I’m getting my master’s,” he said, noting that some of his family members didn’t even finish high school.
He will defend his thesis this spring, after spending the year further developing family trees based on the project’s data. Next on the agenda, the team will use natural language processing tools to analyze a scanned book of colonial-era land grants, extracting details like people, places and plot sizes.
As the team continues their work, Jones reflected on the importance of reclaiming these narratives: “We know a lot about the wealthy, powerful white settlers but much less about the people of color who played vital roles in shaping our state’s history,” Jones said. “California’s past is rich with diversity, far beyond what many realize.”
Housed within the College of Engineering, the Noyce School of Applied Computing brings together Electrical Engineering, Computer Science and Software Engineering, and Computer Engineering, with Statistics as an affiliate. The school provides a collaborative environment where students and faculty in fields like computer science apply data science to solve complex, real-world problems.
Call to Action: Engage with History!
We invite educators, students and history enthusiasts to explore the African Californios website as a valuable resource for learning and teaching. Integrate these untold stories into broader narratives about California’s complex past, fostering a deeper understanding of our cultural heritage.
By sharing this resource, you can help highlight the contributions of African descendants and inspire important conversations about inclusivity in our historical discourse.
Let’s enrich our knowledge and embrace the diverse chapters that make up California’s history!
By Emily Slater