
Google Introduces Bigtable SQL Access and New AI-Ready Features of Spanner



Google on Thursday announced a series of database and data analytics improvements to its cloud data architecture.

In this article, we’ll focus on the substantial improvements to Spanner and Bigtable (two of Google’s cloud database services). These announcements significantly increase interoperability and open the door to additional AI implementations that use the new features Google is introducing.


Spanner is Google’s global cloud database. It excels at providing global consistency, which is much harder than it looks, because Google has solved a number of time-related problems. It is also scalable, meaning the database can grow large and span multiple countries and regions. It is multimodal, meaning it supports media data, not just text. It is also managed through SQL (Structured Query Language) queries.

Bigtable is also extremely scalable (hence the “big” in Bigtable). Its focus is on very wide columns that can be added on the fly and do not need to be defined consistently across all rows. It also has very low latency and high throughput. Until now, it has been described as a NoSQL database, a term for non-relational databases that allow for flexible schemas and data organization.

Both of these tools support large enterprise databases. Spanner is generally a better choice for applications that use globally distributed databases and require strong, immediate consistency and complex transactions. Bigtable is better if high throughput is important. Bigtable has a form of consistency, but propagation delays mean the data is not consistent immediately; it becomes consistent eventually.

Bigtable Announcement

Bigtable is primarily queried through API calls. One of the biggest and most transformative features announced today is SQL queries for Bigtable.

This is very important from a programming skills perspective. In Stack Overflow’s 2023 survey of programming language usage, SQL ranked fourth, with 48.66% of programmers using it. There was no mention of Bigtable in the Stack Overflow survey, so I turned to LinkedIn for some perspective. A quick search for jobs containing “SQL” returned over 400,000 results. Meanwhile, a search for “Bigtable” returned 1,561 results, less than 1% of the SQL total.


So while anyone who knows SQL could learn to make Bigtable API calls, SQL support means the learning curve has been flattened to near zero. Nearly one in two developers can now use the new SQL interface to Bigtable to write queries whenever they need to.

There is one caveat: this Bigtable upgrade doesn’t support all of SQL. Still, Google has implemented over 100 features and promises more to come.

Also introduced in Bigtable are distributed counters. Counters are functions like sums, averages, and other related mathematical functions. Google is introducing the ability to aggregate this data in real time at very high throughput and across multiple nodes in the Bigtable cluster, so applications can perform analytics and aggregation functions across multiple sources simultaneously.

This allows you to do things like calculate daily engagement, find max and min values from sensor measurements, and so on. With Bigtable, you can implement these on very large-scale projects that need fast, real-time insights and can’t tolerate the bottlenecks that typically come from aggregating by node and then aggregating across nodes. Those are big numbers, fast.
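To make the idea concrete, here is a minimal Python sketch of why sums, averages, minimums, and maximums lend themselves to being spread across nodes: each node can keep a small partial summary, and those summaries can be merged in any order. This is only an illustration of the general technique, not Bigtable’s API or implementation; the Partial class, the summarize helper, and the sensor readings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partial:
    """A mergeable summary of the readings seen by one node (hypothetical)."""
    count: int
    total: float
    lowest: float
    highest: float

    @staticmethod
    def of(value: float) -> "Partial":
        # Summary of a single reading.
        return Partial(1, value, value, value)

    def merge(self, other: "Partial") -> "Partial":
        # Merging is associative and commutative, so partial results from
        # different nodes can be combined in any order.
        return Partial(
            self.count + other.count,
            self.total + other.total,
            min(self.lowest, other.lowest),
            max(self.highest, other.highest),
        )

    @property
    def mean(self) -> float:
        return self.total / self.count


def summarize(readings):
    """Fold a non-empty list of readings into one Partial."""
    result = Partial.of(readings[0])
    for value in readings[1:]:
        result = result.merge(Partial.of(value))
    return result


# Made-up sensor readings held on three separate nodes.
node_readings = [[21.5, 22.0], [19.0], [23.5, 20.0, 22.0]]
per_node = [summarize(readings) for readings in node_readings]

# Combine the per-node summaries into one cluster-wide result.
combined = per_node[0]
for part in per_node[1:]:
    combined = combined.merge(part)

print(combined.count, combined.lowest, combined.highest, combined.mean)
```

Because the merge step does not care about order, the combined result matches what a single pass over all the readings would produce. Note that the average is carried as a count and a total rather than as a finished average, since averages of averages do not merge correctly.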

Spanner Announcement

Google has a number of big announcements about Spanner, all of which point the database engine toward providing support for AI projects. The biggest announcement is the introduction of Spanner Graph, which adds graph database capabilities to the globally distributed database functionality at Spanner’s core.

Don’t confuse “graph database” with “graphics”. The term means that the nodes and connections of the database can be illustrated as a graph. If you’ve ever heard the term “social graph” in reference to Facebook, you know what a graph database is. Think of the nodes as entities, like people, places, and items, and the connections (also called edges) as the relationships between the entities.

For example, Facebook’s social graph of you contains all the people you have relationships with, then all the people they have relationships with, etc.

Spanner can now store and manage this type of data natively, which is big news for AI deployments. This gives AI deployments a global, highly consistent, non-local way to represent large amounts of relational information. This is powerful for traversal (pathfinding or network discovery), pattern matching (identifying groups that match a given pattern), centrality analysis (identifying which nodes are more important than others), and community detection (finding groups of nodes that form some kind of cluster, like a neighborhood).
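As a rough illustration of what traversal means in practice, the Python sketch below runs a breadth-first search over a tiny, made-up social graph to find the shortest chain of relationships between two people. It does not use Spanner Graph or GQL; the friends dictionary and the shortest_path function are hypothetical stand-ins for the nodes, edges, and pathfinding described above.

```python
from collections import deque

# Hypothetical social graph: each person (node) maps to the people
# they have a relationship with (edges).
friends = {
    "Ana": ["Ben", "Cho"],
    "Ben": ["Ana", "Dev"],
    "Cho": ["Ana", "Dev"],
    "Dev": ["Ben", "Cho", "Eli"],
    "Eli": ["Dev"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search: return the shortest chain of people, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        person = path[-1]
        if person == goal:
            return path
        for neighbor in graph.get(person, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # No chain of relationships connects the two people.


path = shortest_path(friends, "Ana", "Eli")
print(path)  # ['Ana', 'Ben', 'Dev', 'Eli']
```

A graph database handles this kind of question as a query rather than as hand-written application code, which matters once the graph has billions of edges instead of five people.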


Along with graph data representation, Spanner now supports GQL (Graph Query Language), an industry-standard language for performing powerful queries on graphs. It also works with SQL, meaning developers can use both SQL and GQL in the same query. This can be a big deal for applications that need to sift through row and column data and distinguish relationships in the same query.

Google is also introducing two new search methods for Spanner: full text and vector. Full text is what most people are familiar with — the ability to search within text like articles and documents according to a certain pattern.

Vector search turns words (or even entire documents) into numbers, which are mathematical representations of the data. These are called “vectors,” and they essentially capture the intent, meaning, or essence of the original text. Queries are also converted into vectors (numerical representations), so when an application performs a search, it looks for other vectors that are mathematically similar — essentially calculating similarity.

Vectors can be very powerful because the matches no longer need to be exact. For example, an application that queries “detective novels” will know to search for “mystery novels” as well, “home insurance” will also match “property insurance”, and “table lamp” will also match “desk lamp”.
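The similarity calculation can be sketched in a few lines of Python. This is a toy example under stated assumptions: the three-number vectors are invented by hand, whereas real vectors come from an embedding model and have hundreds or thousands of dimensions, and cosine similarity is just one common way to compare them. Nothing here is Spanner’s API; the documents dictionary and the nearest function are hypothetical.

```python
import math

# Made-up three-dimensional "embeddings"; real ones come from a model.
documents = {
    "mystery novels": [0.9, 0.1, 0.0],
    "property insurance": [0.0, 0.9, 0.2],
    "desk lamp": [0.1, 0.0, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vector, corpus):
    """Return the name of the stored vector most similar to the query."""
    return max(
        corpus,
        key=lambda name: cosine_similarity(query_vector, corpus[name]),
    )


# Hypothetical vector for the query "detective novels".
query = [0.8, 0.2, 0.1]
best = nearest(query, documents)
print(best)  # mystery novels
```

The query never contains the word “mystery”, yet it lands on “mystery novels” because the two vectors point in nearly the same direction.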

You can see how that kind of similarity comparison could be beneficial for AI analysis. In Spanner’s case, those similarity comparisons could be effective across data stored on different continents or server racks.

Open data for deeper insights

According to Google’s Data and AI Trends Report 2024, 52% of the non-technical users surveyed are already using generative AI to provide insights from data. Nearly two-thirds of respondents believe AI will lead to a “democratization of access to insights,” essentially allowing non-programmers to ask new questions about their data without programmers having to put that information into code. 84% believe generative AI will provide those insights faster.

I agree. I am a technical user, but when I fed ChatGPT some raw data from my server and got back powerful, useful business analytics in minutes, without having to write a single line of code, I realized AI was a game changer for my business.


Here’s the problem. According to the survey, 66% of respondents reported that at least half of their data is dark data. That means the data is out there, somewhere, but not accessible for analysis.

Some of this is due to data governance issues, some to data formats or the lack of them, some to the fact that the data cannot be represented in rows and columns, and some to a myriad of other issues.

Essentially, while AI systems can “democratize” access to data insights, that can only be done if the data is accessible to AI systems.

Which brings us to the relevance of Google’s announcements today. All of these features increase access to data, whether through new query mechanisms, the ability of programmers to use existing skills like SQL, the ability of large databases to represent data relationships in new ways, or the ability of search queries to find similar data. They all open up what might previously have been dark data for analysis and insight.


You can follow my daily project updates on social media. Be sure to subscribe to my weekly update, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, and on YouTube at YouTube.com/DavidGewirtzTV.
