Load/use SQL function on Python (avoid round trip)
The Python code I'm writing makes a psycopg2 connection to a PostgreSQL database. I need to build some reports out of the data in this database, so I have a couple Python procs that periodically run and create a csv file out of some tables and nice queries.
The problem I'm facing here is that I need to include a column in my csv report which is the result of a function stored in the PostgreSQL database. This database is managed by another group of persons so I can't write to it. I could easily see the content of that function and emulate the behaviour on Python and have the column values I need to calculate, but in this case this function is periodically changing and it won't make sense to continually update the Python function.
So my question here is if it's possible to somehow load the database function into Python code, every time my code connects to the database. I could make actual use of the function on the database itself, but imagine making 900K calls to a database function from Python code to calculate a value, simply does not scale.
edit: Adding sql function
CREATE OR REPLACE FUNCTION public.p_start(integer, integer) RETURNS numeric LANGUAGE sql IMMUTABLE AS $function$ SELECT CASE WHEN $1 = 0 AND $2 = 0 THEN 0.2760 WHEN $1 = 0 AND $2 = 1 THEN 0.0684 WHEN $1 = 0 AND $2 = 2 THEN 0.0277 WHEN $1 = 0 AND $2 = 3 THEN 0.0189 WHEN $1 = 0 AND $2 = 4 THEN 0.0038 WHEN $1 = 0 AND $2 = 5 THEN 0.0098 WHEN $1 = 1 AND $2 = 1 THEN 0.5501 WHEN $1 = 1 AND $2 = 2 THEN 0.2264 WHEN $1 = 1 AND $2 = 3 THEN 0.1203 WHEN $1 = 1 AND $2 = 4 THEN 0.0804 WHEN $1 = 1 AND $2 = 5 THEN 0.0839 ELSE 0.1 END; $function$
It's a plain sql-language function.
Unless you can rely on its structure remaining stable enough that you can write a simple parser for the function body, fetching its source from the prosrc column of pg_proc, I think you're out of luck. I wouldn't want to do this, as it'd be extremely fragile.
You should ask the database team to store this mapping in a small table instead of a function, then you can just query the table contents and cache them. They can still have a function that returns the result of a select from the table, and it'll get inlined efficiently in most cases, and it won't affect compatibility with existing apps that use the function. E.g. if the info was in a table probabilities the function would be:
CREATE OR REPLACE FUNCTION public.p_start(integer, integer) RETURNS numeric LANGUAGE sql IMMUTABLE AS $function$ SELECT coalesce( SELECT probability FROM probabilities WHERE a = $1 and b = $2 0.1) $function$
Failing that, if the range of possible input values is known in advance, you can generate a map of the full function domain and cache it in your app with something like:
test=> SELECT startval, endval, p FROM generate_series(0,1) startval cross join generate_series(0,5) endval cross join p_start(startval, endval) p; startval | endval | p ----------+--------+-------- 0 | 0 | 0.2760 1 | 0 | 0.1 0 | 1 | 0.0684 1 | 1 | 0.5501 0 | 2 | 0.0277 1 | 2 | 0.2264 0 | 3 | 0.0189 1 | 3 | 0.1203 0 | 4 | 0.0038 1 | 4 | 0.0804 0 | 5 | 0.0098 1 | 5 | 0.0839 (12 rows)
which is trivial to turn into a Python dictionary for lookups locally.
This only works if the function's domain is finite and known.
BTW, it's wrong to define it as IMMUTABLE then re-define it, unless they DROP and re-CREATE it to ensure nothing relies on the old definition. It should be declared STABLE. Re-defining an IMMUTABLE function in a way that can produce different values will result in incorrect query results if the function is used in any expression indexes.