Database Stored Operations
Functions

Jerzy Letkowski, Ph.D.

Outline

·       Database

·       Tables

·       Function Definition

·       Function Examples

DATABASE

Some of the procedures presented in this document require the Furniture database. Use the db.sql script to recreate the database if necessary. In class, run the cdb.bat batch command in the
10-StoredOperations-Functions/Data folder.

TABLES

The following lines describe briefly the tables and fields in the updated Furniture database:

US_State
[id{pk}, state, region, division]

Customer
[cid{pk}, name, address, city, state{fkUS_State.id}, zip]

PurchaseOrder
[poid{pk}, poDate, cid{fkCustomer.cid}]

Product
[pid{pk}, description, finish, unitPrice, onHand]

OrderLine
[quantity,(poid{fkPurchaseOrder.poid}, pid{fkProduct.poi}){pk}]

Notice that token {pk} stands for a Primary Key and token {fk} represents a Foreign Key. Also notice that keyword fk is followed by the table name and column name to which the foreign key is related. For example, in table Customer, column state is a foreign key associated with primary key US_State.id which is all written as state{fkUS_State.id}.

Some database references use even shorter notation, in which the primary keys are underlined with a solid line and the foreign keys use font italics. Such a notation in more compact and is very helpful in developing SQL queries and stored operations. Here is how the Furniture database can be expressed using such a notation:

US_State
[id, state, region, division]

Customer
[cid, name, address, city, state, zip]

PurchaseOrder
[poid, poDate, cid]

Product
[pid, description, finish, unitPrice, onHand]

OrderLine
[quantity, poid, pid]

Notice that in table OrderLine, columns poid and pid are foreign keys (shown in italics) but together they serve as a composite primary key (shown underlined: poid, pid).

MySQL FUNCTION DEFINITION

A function is a named program capable to perform database operations within the database system. It returns a literal and supports typical programming structures like variables, blocks, decisions (conditional execution) and repetitions (loops).   

The following is a minimized definition of the MySQL function:

CREATE FUNCTION FUNCTION_NAME([argument_list])

RETURNS database_type
BEGIN
  function_body
END

The argument_list consists of pairs:
  arg_name, arg_data_type

The arg_name element is the name of a local [input] variable that is expected to submit some data to the function.

The arg_data_type element defined the data type of the argument (INT, FLOAT, CHAR, VARCHAR, etc.).

The function_body segment consists of SQL-procedure statements such as:
  - declaration, for example:
   
DECLARE v_name VARCHAR(64);
  - assignment, for example:
   
SET v_name = 'John';
  - SQL retrieval statement, for example:

 
SELECT name INTO v_name FROM Customer WHERE pid=5;
  FETCH cursor_name INTO variable_name;

  - control structures such as:
     
IF – THEN – ELSE - END IF
     
IF – THEN – ELSEIF – . . . – ELSE – END IF
     
CASE – WHEN – ELSE – END CASE
     
loop_id:LOOP – LEAVE loop_id – END LOOP loop_id
     
loop_id:REPEAT – UNTIL - END REPEAT loop_id
      loop_id:WHILE – DO - END WHILE loop_id.  

FUCTION EXAMPLE 1 

Purpose: Add two double-precision numbers (supplied to the function via its input arguments).

Implementation

This function does not use any data from the database but it must be part of some database. Why don’t we first create a public database (for example MYUTILS) that later can be used in other databases.

CREATE DATABASE IF NOT EXISTS MYUTILS;

USE MyUTILS;

-- Drop the function if it already exist (prefix f of the function name indicates that this is a function):

DROP FUNCTION IF EXISTS fSUMOF;

-- Set the delimiter to ~~~ (a similar feature we did in stored procedures and triggers):

 

DELIMITER ~~~

-- Now, define the function:

CREATE FUNCTION fSUMOF(x DOUBLE, y DOUBLE)

RETURNS DOUBLE

BEGIN

  RETURN x+y;

END~~~

-- Reset the delimiter back to a semi-colon:

DELIMITER ;

-- Use this function to add two numbers (for example 1.5 and 2.5):

SELECT fSUMOF(1.5, 2.5);

-- MySQL replies:

  fSUMOF(1.5, 2.5)

                 4

Signatures of functions (here SUMOF(x DOUBLE, y DOUBLE)) work much like in procedures. However, functions use IN-type arguments. This is the most important feature of functions. They are intended to receive some input and then return some output. If we wanted to have a procedure that does the same job, its signature would be:
   pGETSUM(IN x DOUBLE,IN y DOUBLE, OUT result DOUBLE).

One must declare the type of object being produced (returned) by a function. SUMOF is expected to return a DOUBLE value.

The body of this function has just one statement. It both calculates and returns the sum of the values of the two arguments, x, y.

A similar function created in Python would look as follows:

def sumOf(x, y):

    return  x + y

 

# Example

x = sumOf(1.5,2.5)

print(x)

FUNCTION EXAMPLE 2

Purpose: Show the current period of the day (midnight, night, morning, etc.).

CREATE DATABASE IF NOT EXISTS MYPROC;

USE MYPROC;

-- Drop the function if it already exists:

DROP FUNCTION IF EXISTS fDAYPERIOD;

-- Set the delimiter to ~~~ :

 

DELIMITER ~~~

-- Now, create the function:

 

CREATE FUNCTION fDAYPERIOD()

RETURNS VARCHAR(9)

BEGIN

  DECLARE t Time;

  DECLARE dp VarChar(9);

  SET t = CurTime();

  If t < '00:15:00' THEN

    SET dp = 'midnight';

  ELSEIF t < '05:00:00' THEN

    SET dp = 'night';

  ELSEIF t < '11:45:00' THEN

    SET dp = 'morning';

  ELSEIF t < '12:15:00' THEN 

    SET dp = 'noon';

  ELSEIF t < '17:00:00' THEN 

    SET dp = 'afternoon';

  ELSEIF t < '20:00:00' THEN 

    SET dp = 'evening';

  ELSEIF t < '23:45:00' THEN 

    SET dp = 'night';

  ELSE

    SET dp = 'midnight';

  END IF;

  RETURN dp;

END;

~~~

-- Reset the delimiter back to a semi-colon:

DELIMITER ;

-- Use this function:

SELECT fDAYPERIOD();

-- MySQL replies (at 5:32 p.m.):

DAYPERIOD()

 evening

Notice that function CurTime() takes the current time from the computer’s clock. It is therefore necessary to keep the clock synchronized with a common time Web service (e.g. www.time.gov, time.windows.com, etc.).

FUNCTION EXAMPLE 3

Purpose: Average them with a CURSOR and LOOP.

Implementation 1 (Using LOOP – END LOOP)

-- Create this function in database Furniture:

USE Furniture;

-- Drop the function if it already exists:

DROP FUNCTION IF EXISTS fPRICE_AVERAGE;

-- Set the delimiter to ~~~ :

 

DELIMITER ~~~

-- Now, create the function:

CREATE FUNCTION fPRICE_AVERAGE()

RETURNS FLOAT

BEGIN

  DECLARE v_n, v_flag INT;

  DECLARE v_sum, v_price FLOAT;

  DECLARE cProductPrice CURSOR FOR SELECT

     unitPrice FROM Product;

  DECLARE CONTINUE HANDLER FOR NOT FOUND SET v_flag = 1;

  SET v_n = 0;

  SET v_sum = 0.0;

  SET v_flag = 0;

  OPEN cProductPrice;

  sumLoop:LOOP

    FETCH cProductPrice INTO v_price;

    IF v_flag = 1 THEN

      LEAVE sumLoop;

    END IF;

    SET v_sum = v_sum + v_price;

    SET v_n = v_n + 1;

  END LOOP sumLoop;

  CLOSE cProductPrice;

  RETURN v_sum/v_n;

END;

~~~

-- Reset the delimiter back to a semi-colon:

DELIMITER ;

-- Use this function:

SELECT fPRICE_AVERAGE();

-- MySQL replies (at 5:32 p.m.):

  fPRICE_AVERAGE()

   372.3163757324219

Much ado about nothing! Why would anyone want to write such a  complicated function if this problem can be easily solved with a simple SQL query?

SELECT AVG(unitPrice) FROM Product;

-- MySQL replies:

AVG(unitPrice)

  372.31636463512075

The reason is more pedagogical than technical. First, the function can be easily tested by comparing its outcome to the SQL based. Second, it is a relatively simple example for using stored operation features such as CURSOR, FETCH, HANDLER and LOOP

In MySQL, a cursor is an iterative pointer for a SELECT-query result.

DECLARE cProductPrice CURSOR FOR SELECT unitPrice FROM Product;

Right after its opening (OPEN cProductPrice;), cursor cProductPrice is set to point to the first row of its query (SELECT unitPrice FROM Product) result. Each time statement FETCH cProductPrice INTO v_price; is executed, the cursor is set to point to the next row. Since the FETCH statement is executed in a loop, it will be repeated as long as variable v_flag = 0. It fetches the value of the unit price (unitPrice) and adds this value to variable v_sum and increments variable v_n (counter) by 1. As you can see, the flag is set to 0 right before the loop and it is set to 1 by a special handler (DECLARE CONTINUE HANDLER FOR NOT FOUND SET v_flag = 1;). This handler is working in the background, “watching” where the cursor is pointing to. As long as the cursor points to a row, containing data, the handler keeps its flag unchanged. As soon as the cursor moves past the last row, it no longer points to any data (NOT FOUND), the handler sets the flag to 1 (as assumed in the handler’s definition). Thus, after the FETCH statement has retrieved the value in the last row, the handler performs the assignment SET v_flag = 1, and the logical condition of the IF statement gets evaluated as TRUE, causing the Leave sumLoop; statement to be carried out. Right after the loop has been terminated, the function returns the average price (RETURN v_sum/v_n) .

The “sandwich” structure of the REPEAT loop is:

[loopTag:]LOOP

  loop_body1

  . . .

  exit_statment (e.g.an IF statement with LEAVE loopTag)

  loop_body2

  . . .

END LOOP [loopTag:];

The statements of the loop’s body get executed repeatedly until the logical expression of the IF statement, containing the loop terminating statement (LEAVE loopTag;), returns TRUE.

Now, as we understand how to loop through a record set, we can do all kinds of processing that is either hard or impossible to accomplish by [straight] SQL queries. 

Implementation 2 (Using REPEAT)

-- Create this function in database Furniture:

USE Furniture;

-- Drop the function if it already exists:

DROP FUNCTION IF EXISTS fPRICE_AVG;

-- Set the delimiter to ~~~ :

 

DELIMITER ~~~

-- Now, create the function:

CREATE FUNCTION fPRICE_AVG()

RETURNS FLOAT

BEGIN

  DECLARE v_n, v_flag INT;

  DECLARE v_sum, v_price FLOAT;

  DECLARE cProductPrice CURSOR

    FOR SELECT unitPrice FROM Product;

  DECLARE CONTINUE HANDLER FOR NOT FOUND SET v_flag = 1;

  SET v_n = 0;

  SET v_sum = 0.0;

  SET v_flag = 0;

  OPEN cProductPrice;

  REPEAT

    FETCH cProductPrice INTO v_price;

    IF v_flag = 0 THEN

      SET v_sum = v_sum + v_price;

      SET v_n = v_n + 1;

    END IF;

  UNTIL v_flag = 1 END REPEAT;

  CLOSE cProductPrice;

  RETURN v_sum / v_n;

END;

~~~

-- Reset the delimiter back to a semi-colon:

DELIMITER ;

-- Use this function:

SELECT fPRICE_AVG();

-- MySQL replies:

  fPRICE_AVG()

   372.3163

The “sandwich” structure of the REPEAT loop is:

[loopTag:]REPEAT

  loop_body

UNTIL logical_expression END REPEAT [loopTag:];

The loop’s tag it optional (if it is not used). The statements of the loop’s body get executed repeatedly until the logical expression of the UNTIL clause returns TRUE.

Implementation 3 (Using WHILE)

-- Create this function in database Furniture:

USE Furniture;

-- Drop the function if it already exists:

DROP FUNCTION IF EXISTS fAveragePrice;

-- Set the delimiter to ~~~:

 

DELIMITER ~~~

-- Now, create the function:

CREATE FUNCTION fAveragePrice()

RETURNS float

BEGIN

  DECLARE v_n, v_flag INT;

  DECLARE v_sum, v_price FLOAT;

  DECLARE cProductPrice CURSOR

    FOR SELECT unitPrice FROM Product;

  DECLARE CONTINUE HANDLER FOR NOT FOUND SET v_flag = 1;

  SET v_n = 0;

  SET v_sum = 0.0;

  SET v_flag = 0;

  OPEN cProductPrice;

  FETCH cProductPrice INTO v_price;

  WHILE v_flag = 0 DO

    SET v_sum = v_sum + v_price;

    SET v_n = v_n + 1;

    FETCH cProductPrice INTO v_price;

  END WHILE;

  CLOSE cProductPrice;

  RETURN v_sum / v_n;

END;

~~~

-- Reset the delimiter back to a semi-colon:

DELIMITER ;

-- Use this function:

SELECT fAveragePrice();

-- MySQL replies:

  fAveragePrice()

   372.316375732421

The “sandwich” structure of the WHILE loop is:

[loopTag:] WHILE logical_expression DO

  loop_body

END WHILE [loopTag:];

The loop’s tag it optional (if it is not used). The statements of the loop’s body get executed repeatedly until the logical expression of the WHILE clause returns FALSE.

Choosing any of the available loop structures (LOOP, REPEAT, WHILE) is just a matter to individual preference or taste. As you can see from the above examples, they lead to the same solution.

FUNCTION EXAMPLE 4

Purpose: Show the median unitPrice in table Product.

Unlike Excel, MySQL does not have a built-in function to calculate the median of a numeric data set (column).  This example shows how to do it via a user-defined function. Notice quite a high complexity of the SQL queries inside the function.  Do you remember the definition of the median?

Implementation 1 (Using SELECT Queries)

-- Create this function in database Furniture:

USE Furniture;

-- Drop the function if it already exists:

DROP FUNCTION IF EXISTS fMEDIANPRICE;

-- Set the delimiter to ~~~ :

 

DELIMITER ~~~

-- Now, create the function:

CREATE FUNCTION fMEDIANPRICE()

RETURNS FLOAT

BEGIN

  DECLARE v_n int;

  DECLARE v_m1 int;

  DECLARE v_m2 int;

  DECLARE v_p1 float;

  DECLARE v_p2 float;

 

  SELECT COUNT(unitPrice) INTO v_n FROM Product;

  IF MOD(v_n,2) = 0 THEN

    SET v_m1 = (v_n DIV 2);

    SET v_m2 = v_m1 + 1;

    SELECT unitPrice INTO v_p1 FROM

      (SELECT @i:=@i+1 'pos', unitPrice FROM

        Product ,(SELECT @i:=0) AS init

        ORDER BY unitPrice) AS sortedUnitPrice

      WHERE pos = v_m1;

    SELECT unitPrice INTO v_p2 FROM

      (SELECT @i:=@i+1 'pos', unitPrice FROM

        Product ,(SELECT @i:=0) AS init

        ORDER BY unitPrice) AS sortedUnitPrice

      WHERE pos = v_m2;

    RETURN (v_p1 + v_p2)/2;

  ELSE

    SET v_m1 = ((v_n+1) DIV 2);

    SELECT unitPrice INTO v_p1 FROM

      (SELECT @i:=@i+1 'pos', unitPrice FROM

        Product ,(SELECT @i:=0) init

        ORDER BY unitPrice) AS sortedUnitPrice

      WHERE pos = v_m1;

    RETURN v_p1;

  END IF;

END;

~~~

-- Reset the delimiter back to a semi-colon:

DELIMITER ;

-- Use this function:

SELECT fMEDIANPRICE();

-- MySQL replies:

  fMEDIANPRICE()

   250

Run the internal queries outside the function to find out what they do.

First find out the size of the sample:

SELECT COUNT(unitPrice) INTO @n FROM Product;
SELECT @n;

-- MySQL replies:

      @n

      11

Since the size is odd, add 1 to it and divide the sum by two to get the middle position.

SELECT (@n+1)/2;

-- MySQL replies:

   (@n+1)/2

         6

This means that the median is the 6th value. Use this result to run the following query:

SELECT unitPrice AS medianPrice

FROM (SELECT @i := @i+1 'pos', unitPrice

  FROM Product, (SELECT @i := 0) init

  ORDER BY unitPrice) tmp

WHERE pos = 6;

-- MySQL replies:

   medianPrice

         250

Try this query this way (without the WHERE clause):

Set @i=0;

Select @i:=@i+1 'pos', unitPrice

From Product Order by unitPrice;

-- MySQL replies:

    pos   unitPrice

    1     145.99

    2        175

    3        200

    4        225

    5     245.99

    6        250

    7        325

    8      328.5

    9        650

   10        750

   11        800

You can see that this query generates and an index like column! You can look up price at the position appropriate for the median. This is a very cool stuff!

Notice that this manual procedure is equivalent to the second part (ELSE) of the function’s code:

SET v_m1 = ((v_n+1) DIV 2);

SELECT unitPrice INTO v_p1 FROM

 (SELECT @i:=@i+1 'pos', unitPrice FROM

    Product ,(SELECT @i:=0) init

    ORDER BY unitPrice) AS sortedUnitPrice

WHERE pos = v_m1;

RETURN v_p1;

This part deals with an odd sample size. The decision about it is done by the IF statement:

IF MOD(v_n,2) = 0 THEN

  -- the EVEN sample size situation.

ELSE

  -- the ODD sample size situation.

END IF

Expression MOD(v_n,2) calculates the (remainder) modulus of the v_n/2 ratio. Even numbers return MOD(v_n,2)  =  0 and odd number—1. Thus it is a convenient way to find out if the sample is even or odd.

Notice that in the EVEN situation, the median is calculated as the average of two middle values. If the position of the first value is calculated as:
v_m1 = (v_n DIV 2)  then the position of the next value is
v_m2 = v_m1 + 1. Keyword DIV stands for an Integer Division Operator (for example 7 DIV 3 = 2). The two middle values are produced two queries:

SELECT unitPrice INTO v_p1 FROM

  (SELECT @i:=@i+1 'pos', unitPrice FROM

       Product ,(SELECT @i:=0) AS init

       ORDER BY unitPrice) AS sortedUnitPrice

  WHERE pos = v_m1;

SELECT unitPrice INTO v_p2 FROM

  (SELECT @i:=@i+1 'pos', unitPrice FROM

       Product ,(SELECT @i:=0) AS init

       ORDER BY unitPrice) AS sortedUnitPrice

  WHERE pos = v_m2;

Finally, the median is calculated as the average of the values returned by the queries, RETURN (v_p1 + v_p2)/2.

Implementation 2 (Using CURSOR)

-- Create this function in database Furniture:

USE Furniture;

-- Drop the function if it already exists:

DROP FUNCTION IF EXISTS fMEDPRICE;

-- Set the delimiter to ~~~ :

 

DELIMITER ~~~

-- Now, create the function:

CREATE FUNCTION fMEDPRICE()

RETURNS FLOAT

BEGIN

  DECLARE v_i, v_n, v_m1, v_m2 INT;

  DECLARE v_p1, v_p2, v_median FLOAT;

  DECLARE cProductPrice CURSOR FOR

    SELECT unitPrice FROM Product ORDER BY unitPrice;

   

  SELECT COUNT(unitPrice) INTO v_n FROM Product;

  OPEN cProductPrice;

  SET v_i = 1;

  IF MOD(v_n,2) = 0 THEN

    SET v_m1 = (v_n DIV 2);

    SET v_m2 = v_m1 + 1;       

    evenCaseLoop:LOOP

       FETCH cProductPrice INTO v_p1;

       IF v_i = v_m1 THEN

         FETCH cProductPrice INTO v_p2;

         LEAVE evenCaseLoop;

       END IF;

       SET v_i = v_i + 1;

    END LOOP evenCaseLoop;

    SET v_median = (v_p1 + v_p2)/2;

  ELSE

    SET v_m1 = ((v_n+1) DIV 2);

    oddCaseLoop:LOOP

       FETCH cProductPrice INTO v_p1;

       IF v_i = v_m1 THEN

         LEAVE oddCaseLoop;

       END IF;

       SET v_i = v_i + 1;

    END LOOP oddCaseLoop;

    SET v_median = v_p1;

  END IF;

  CLOSE cProductPrice;

  RETURN v_median;

END;

~~~

-- Reset the delimiter back to a semi-colon:

DELIMITER ;

-- Use this function:

SELECT fMEDPRICE();

-- MySQL replies:

  fMEDPRICE()

   250

For odd samples the idea behind this algorithm is to fetch and count values of the unit prices until the value at the median’s position is encountered. A similar approach is used for even samples except two middle values are fetched.

Notice that in this example no handler is used. Since the cursor never arrives at the end of its query result set, the handler is not needed.

CHALLENGE   

Develop a function that will return the largest difference between consecutive values (v) of table ValueList:

 

DROP DATABASE IF EXISTS Cool;

CREATE DATABASE Cool;

USE Cool;

CREATE TABLE ValueList (

  k INT,

  v INT

);

INSERT INTO ValueList(k,v) VALUES(1,9),(2,7),(3,6),(4,4),(4,1);

SELECT k,v FROM ValueList;

    k     v   

    1     9

    2     7

    3     6

    4     4

    4     1

Such a function should return 3 for the above table.

REFERENCES

[1] MySQL Workbench, Data Modeling & Development by Michael McLaughlin. Chapter 5, Creating and Managing Routines (p. 159 – 172)

[2] Database Journal - MySQL Cursors

[3] Database Journal - MySQL Stored Functions

[4] MySQL Functions by Guru99