Postgresql cross join

8/13/2023

The best description in the documentation comes at the bottom of the list of FROM clause options : Interested in learning more about Heap Engineering? Meet our team to get a feel for what it’s like to work at Heap! What is a LATERAL join? In this post, I’ll walk through a conversion funnel analysis that wouldn’t be possible in PostgreSQL 9.2. I’m going to train my coworkers on the proper use of LATERAL JOINs as they seem pretty easy and they save a lot of cpu time.PostgreSQL 9.3 has a new join type! Lateral joins arrived without a lot of fanfare, but they enable some powerful new queries that were previously only tractable with procedural code. It’s new to PostgreSQL 9.3, so it should be in the majority of servers. Placing it exactly in the middle allows me to specify that I need facturascli rows, but not lineasfacturascli ones, at the moment of calling the function. ¿What does a LATERAL JOIN? It allows a subquery JOIN to refer to fields defined on a previous JOIN. Also the query plan was very straightforward, and did what I wanted, do X for every row in facturascli table. So I started to learn LATERAL JOIN queries, and managed to test something like:ĬROSS JOIN LATERAL observaciones(dcliente, dserie)Īnd then, it worked like a charm! I placed 4 lateral joins in total, and the query timing was 2 seconds!. And the timming was better, but not close to the 1.2 seconds I wanted. I thought I would be forcing PostgreSQL to call it once per invoice, but the plan doesn’t reflected this. INNER JOIN lineasfacturascli USING(idfactura) SELECT *, observaciones(dcliente, dserie) FROM facturascli ) facturascli Then I tried to move the function query into a subquery, so it could only see records from invoices, and then joining those results to the main query. Optimizations using volatility appear when you pass static “inplace” variables to it. I tried to change the function volatility from VOLATILE to INMUTABLE in the hopes it would help the parser call it less times. Looking at the plan, as it is presented by the pgadmin3 tool, it was pretty clear it was doing it in every record.

So the 99% of the time came from these functions. I removed manually these functions to compare times, and I got 1.2 seconds. This wasn’t one of such cases, obviously. But because 95% of the time PostgreSQL manages to plan like it was a JOIN (it does some kind of transformations on its own), most of the time the performance penalty is zero. The donwside of abusing subqueries on SELECT is, performance downgrade.

Several times we had a “row multiplication” problem, when a JOIN was doing something unwanted and we got more rows than we expected. That’s because I prefer a query to be slow rather than to be wrong. I usually say to my coworkers that to add extra data to the query is better to fit subqueries to the select rather to adding joins. So that’s were the problem was coming from. And, because a function is a blackbox to PostgreSQL, the server ends up calling it in every output row despite that is useless. The problem is, this SQL generates several rows per invoice. (The function we actually use is a bit larger than that)Īs you can guess from the parameters, it requires one function call for each customer invoice (facturascli). In the case of this customer, he had a version where it had 3 calls to the function in the SQL body. This function is something we designed to let the user add texts to every invoice, using different programable rules. The problem was in the selected fields, where there was a PGSQL function: As we use Eneboo ERP (which is Free Source, GPLv2) the tables and fields are in spanish, and their names are common to any free user.īut the problem wasn’t there.

These SQL are a bit complicated since they currently have 17 tables in the FROM section. I started PgAdmin3 and copied the SQL from the PostgreSQL log to it. Of course, the server also is a cheap one, but anyways that doesn’t explain that bad timing.Īs we work in PostgreSQL database, this server has installed 9.4 version, used actively in production. What’s interesting here is that this database doesn’t hold a specially high amount of data compared to other customers, so I was very suprised of this sql timings. In the process, I’ve found they were trying to print every invoice of the last year 2016, and some fault on the SQL was making the query incredibly slow: 880 seconds to complete. Today I was researching problems on one customer database server because it worked a bit slowly in the last 6 months and we don’t know exactly why. The opinions stated in this blog are my own, not those of my company or any previous employer.

0 Comments

Postgresql cross join

Leave a Reply.

Author

Archives

Categories