-
-
Notifications
You must be signed in to change notification settings - Fork 600
Description
Feature description
As mentioned here: custom-queries
LIMIT/OFFSET
pagination has performance issue when over 100,000 records.
In V4, we implement our custom setof function in such a complicated way.
I hope the V5 can provide some kind of framework to create optimized setof
function.
Motivating example
Graphile is very powerful, and performance is very fast.
But the setof function
is some kind of hidden bomb
.
In our recent project, development stage is good.
But after deploy to production, import 1,000,000 real data records.
The setof function
become very slow.
we have tested it, OFFSET 900,000 records limit 10, need over 15s. It is very slow and unacceptable.
to solve it, currently, we implement our custom-queries by a very complicated way.
here is out V4 solution:
Our V4 custom-queries
As the doc says: One way to solve this is to have your function apply its own internal limits and filters which can be exposed as GraphQL field arguments - if you reduce the amount of data that the function can produce (e.g. to 100 rows) then it reduces the potential cost of having this function in your schema.
The primary goal is keeping the setof function input/output no breaking change.
we implement our custom-queries in such a complicated way:
1, pass args to setof function (as graphile setof function
: first, offset, last, after, before)
CREATE OR REPLACE FUNCTION test.fn_test_setof_optimize_aaa(_filters jsonb[] DEFAULT NULL,
_orders jsonb[] DEFAULT NULL,
first int DEFAULT NULL,
"offset" int DEFAULT NULL,
last int DEFAULT NULL,
after text DEFAULT NULL,
before text DEFAULT NULL)
RETURNS test.test_aaa_page_connection -- return type is custom, not setof
2, add hook pluin to make gql
Cursor
type for after
and before
because text
type can't be recognize as GQL Cursor
type, need a pluin to do this:
module.exports = makeProcessSchemaPlugin(schema => {
const stringField = schema._typeMap.String
const fileds = schema._typeMap.Query._fields
const fnNames = Object.keys(schema._typeMap.Query._fields)
for (let fnName of fnNames) {
if (fnName.startsWith('fn')) {
const item = fileds[fnName]
const args = item.args
if (Array.isArray(args)) {
for (let arg of args) {
const type = arg.type
if (['after', 'before'].includes(arg.name)
&& typeof type === 'object' && type !== null
&& typeof stringField === 'object' && stringField !== null
&& type.constructor === stringField.constructor) {
arg.type = schema._typeMap.Cursor
}
}
}
}
}
return schema;
});
3, add setof context at pg_setting
const gql = require('graphql-tag')
pgSettings: async (req) => {
let qlParsed = gql`${gqlString}`
// recursive function
setContextOpt(gqlParsed, 'definitions[0].selectionSet.selections', [])
// ...
const fieldsToCheck = ['totalCount', 'nodes', 'edges', 'pageInfo']
//...
subSelections.forEach(subSelection => {
const subSelectionName = _.get(subSelection, 'name.value', '')
if (fieldsToCheck.includes(subSelectionName)) {
ret[`opt.setof.${underscoreName}.has${subSelectionName.charAt(0).toUpperCase() + subSelectionName.slice(1)}`] = true
}
})
//...
// pgSettings output like:
// opt.setof.fn_test_setof_optimize_aaa.hasTotalCount: true
// opt.setof.fn_test_setof_optimize_aaa.hasNodes: true
4, create setof custom return type (actually, this should be step 1)
CREATE TYPE test.type_test_aaa_edge AS (
cursor varchar,
node test.test_aaa
);
CREATE TYPE test.test_aaa_page_connection AS (
nodes test.test_aaa[],
edges test.type_test_aaa_edge[],
total_count INT,
page_info basic.type_basic_page_info
);
5, finally, optimize the custom function
IF COALESCE(CURRENT_SETTING('opt.setof.' || 'fn_test_setof_optimize_aaa' || '.hasTotalCount', TRUE), '') = 'true' THEN
-- ...
-- set total_count in test.test_aaa_page_connection
END IF;
IF COALESCE(CURRENT_SETTING('opt.setof.' || 'fn_test_setof_optimize_aaa' || '.hasNodes', TRUE), '') = 'true'
OR COALESCE(CURRENT_SETTING('opt.setof.' || 'fn_test_setof_optimize_aaa' || '.hasEdges', TRUE), '') = 'true'
OR COALESCE(CURRENT_SETTING('opt.setof.' || 'fn_test_setof_optimize_aaa' || '.hasPageInfo', TRUE), '') = 'true'
THEN
-- ...
-- use offset and first, optimize use index, two steps query, only return need records, like 10 records
-- use after and first, fast cursor location query, best performance
-- create own cursor text for each edge
-- get one more to check hasNextPage
END IF;
Also, need other optimize _orders
and _filter
. and _uniq_key
to create cursor. and Multicolumn Indexes
for performance.
After those optimization, we success reduce the function time from 15s to 50~500ms. And client no need change any code.
suggest V5 solution
No need step 1~4, just use smart tag
, only need implement step 5 pg function.
like this smart tags:
CREATE OR REPLACE FUNCTION test.fn_test_aa_bb_cc(...)
RETURNS jsonb -- include: total_count, nodes, edges, page_info
COMMENT ON FUNCTION test.fn_test_aa_bb_cc IS
E'@setofFnOpt\n@setofFnOptOutType test.test_table_kkk';
one more question
is it possible implement it by smart tag
using plugin do all stuff in V4? currently, this solution use hook plugin, pgSetting function, pg custom type, seems not elegant.
Metadata
Metadata
Assignees
Labels
Type
Projects
Status