What does it mean to design the 'guts' of something? What is it like to put together the scaffolding of a building — a building, say, made out of data? Data architects create blueprints for data management systems.
Let's explore this profession and answer some questions along the way. Why is there continued interest in this position and are there areas of specialization where knowledge is changing rapidly? What training and employment background is helpful? What certifications can help strengthen an individual's preparation and knowledge base for this role?
The data science workplace
I once watched an episode of Star Trek with a story arc about the relentless pursuit of data storage. One character makes an observation about "mankind's relentless pursuit" of collecting and categorizing everything that they see, hear and feel.
I feel confident that there will always be a need for data architecture and for someone to design and understand how data flows through companies, organizations, and institutions. The need for workers skilled in this role will only increase as we move forward.
In the still-dawning era of Big Data and data science, it is incredibly important for any company to have a centralized data architecture aligned with business processes. Not only that, but the architecture in play has to scale with business growth and evolve to suit technological advancements.
A robust and secure — yes, data architects have to think about security at all times — data architecture provides clarity about every aspect of the data. That, in turn, enables data scientists to work with trustable data efficiently and to solve complex business problems.
Data architects must also improve operational efficiency by managing complex data and information delivery throughout the enterprise. Data and Big Data analytics have become the lifeblood of any successful business. Getting the technology right can be challenging, but building the right team with the right skills to undertake big data initiatives can be even harder.
Successfully deploying Big Data initiatives requires more than data scientists and data analysts. It requires data architects who design the "blueprint" for your enterprise data management framework, and it requires data engineers who can build that framework and the data pipelines to bring in, process, and create business value out of data.
The role
So, beyond the metaphor, what exactly is it that a data architect does, and what are some of their job duties? After assessing a company's potential data sources (internal and external), architects design a plan to integrate, centralize, protect, and maintain them. This allows employees to access critical information in the right place, at the right time.
It sounds complicated and, for the most part — certainly from my perspective — it is. Data is everywhere, data sources are everywhere, and there is no limit to how many end users and stakeholders want their data delivered in a clean, crisp format. The demand is continuous.
A data architect designs those channels, understands the flow, and works with database designers to settle on the best table format.
Or perhaps you might prefer the formal definition of a data architect. Here's how Wikipedia defines it: 'A data architect is a practitioner of data architecture, a data management discipline concerned with designing, creating, deploying and managing an organization's data architecture. Data architects define how the data will be stored, consumed, integrated and managed by different data entities and IT systems, as well as any applications using or processing that data in some way.'
See? It's complicated.
On the job
A data architect has an expansive and important set of job responsibilities. Besides a strong sense of purpose and the ability to work across teams, a data architect will usually be expected to tackle the following tasks.
Design data architectures: You create the tables and the flow, the way that data is delivered to customers and other end users. This involves working with designers and business intelligence (BI) specialists to sort through how data is organized and displayed.
Build databases: You should expect to build relational databases, and you could be called on to create NoSQL databases. A data architect must be able to develop strategies for data acquisition, archive recovery, and implementation of a database.
Upgrade older technologies: Data architects often get stuck with legacy systems, so a successful architect must be able to both understand and maintain old systems, while simultaneously overhauling and upgrading them. (You brought your magic wand, right?)
Data management: Data architects also clean and maintain the databases in their charge by removing old data.
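To make a couple of these tasks more concrete, here is a minimal sketch of what building a small relational database and clearing out old data might look like. It uses Python's built-in sqlite3 module. The table names (customers, orders, orders_archive), the columns, and the cutoff date are all hypothetical and chosen only for illustration. The sketch also assumes that old rows are copied to an archive table before they are deleted, which is one common approach rather than the only one.

```python
import sqlite3


def build_schema(conn):
    """Create a tiny, hypothetical relational schema for illustration."""
    conn.executescript(
        """
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            placed_on   TEXT NOT NULL,  -- ISO date string, e.g. 2024-06-15
            total       REAL NOT NULL
        );
        -- Assumption: old orders are archived rather than simply discarded.
        CREATE TABLE orders_archive (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL,
            placed_on   TEXT NOT NULL,
            total       REAL NOT NULL
        );
        """
    )


def archive_old_orders(conn, cutoff):
    """Move orders placed before `cutoff` into the archive table.

    Returns the number of rows removed from the live orders table.
    """
    # The `with` block wraps both statements in a single transaction,
    # so a failure leaves the live table untouched.
    with conn:
        conn.execute(
            "INSERT INTO orders_archive "
            "SELECT id, customer_id, placed_on, total "
            "FROM orders WHERE placed_on < ?",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM orders WHERE placed_on < ?", (cutoff,))
    return cur.rowcount


# Usage with an in-memory database and made-up sample rows.
conn = sqlite3.connect(":memory:")
build_schema(conn)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Acme')")
conn.executemany(
    "INSERT INTO orders (customer_id, placed_on, total) VALUES (?, ?, ?)",
    [(1, "2015-03-01", 120.0), (1, "2024-06-15", 80.0)],
)
moved = archive_old_orders(conn, "2020-01-01")
```

In a real enterprise setting the retention rules, the storage engine, and the archive destination would be driven by business and compliance requirements, so treat this as a toy model of the idea and not a template.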
Education and background
A data architect has precise educational and professional requirements and should possess in-depth knowledge in business and art (as it relates to IT databases). I think this role has a very creative side. A successful data architect should have a bachelor's degree in computer science, computer engineering, or a related field. A master's degree in data science can only help.
Most data architect positions also require several years of experience as a database administrator or designer, along with proficiency in MongoDB, Hadoop, and other database modeling and management tools.
Data architects typically have years of experience in data design, data management and data storage, while data engineers typically have skills around Hadoop, Spark, and the open source big data ecosystem, complemented with programming skills in Java, Scala, or Python.
Data architects generally come from an IT background, with professional IT experience in related roles at a few companies or industries. Incoming hires often have limited exposure to whatever business brings them aboard. You should expect to be continually learning on the job.
You should expect to accrue at least five years of hands-on data and database experience. Most jobs will require a candidate who has good judgment and impeccable soft skills. Soft skills will be particularly important for architects who move up the ladder and start managing people and their job functions.
Certification
If you're looking for a way to get an edge, certification is a great option. Certifications measure your knowledge and skills against industry- and vendor-specific benchmarks to prove to employers that you have the right skillset.
The certification at the top of my list is Amazon Web Services (AWS) Certified Big Data - Specialty. AWS is so hot right now.
Because of the involvement of big data in the job role, the Cloudera Certified Associate (CCA) Spark and Hadoop Developer is an amazing certification.
In recent years, the Data Science Council of America (DASCA) Associate Big Data Engineer cert has risen through the ranks, and I would be remiss to overlook Google Professional Data Engineer.
No matter which certification you choose and how far you go along the route of formal education, you will need to pursue it with passion. Set a goal and never lose sight of it. As always, I wish you the best and happy certifying!