
Oracle 18c preinstall RPM on RedHat RHEL


By Franck Pachot

The Linux prerequisites for Oracle Database are all documented, but using the preinstall RPM makes everything easier. Before 18c, this was easy on Oracle Enterprise Linux (OEL) but not so easy on RedHat (RHEL), where the RPM had many dependencies on OEL and UEK.
Now that 18c is available for download, there is also an 18c preinstall RPM, and the good news is that it can also be installed on RHEL without modification.

This came to my attention on Twitter, and of course it is fully documented:
https://docs.oracle.com/en/database/oracle/oracle-database/18/cwlin/about-the-oracle-preinstallation-rpm.html#GUID-C15A642B-534D-4E4A-BDE8-6DC7772AA9C8

To test it, I quickly created a CentOS instance on the Oracle Cloud:
[Screenshot: CentOS instance created on the Oracle Cloud]

I’ve downloaded the RPM from the OEL7 repository:

[root@instance-20180803-1152 opc]# curl -o oracle-database-preinstall-18c-1.0-1.el7.x86_64.rpm https://yum.oracle.com/repo/OracleLinux/OL7/latest/x86_64/getPackage/oracle-database-preinstall-18c-1.0-1.el7.x86_64.rpm
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 18244 100 18244 0 0 63849 0 --:--:-- --:--:-- --:--:-- 63790

then ran the installation:

[root@instance-20180803-1152 opc]# yum -y localinstall oracle-database-preinstall-18c-1.0-1.el7.x86_64.rpm

 
It automatically installs all dependencies:
Installed:
oracle-database-preinstall-18c.x86_64 0:1.0-1.el7
 
Dependency Installed:
compat-libcap1.x86_64 0:1.10-7.el7 compat-libstdc++-33.x86_64 0:3.2.3-72.el7 glibc-devel.x86_64 0:2.17-222.el7 glibc-headers.x86_64 0:2.17-222.el7
gssproxy.x86_64 0:0.7.0-17.el7 kernel-headers.x86_64 0:3.10.0-862.9.1.el7 keyutils.x86_64 0:1.5.8-3.el7 ksh.x86_64 0:20120801-137.el7
libICE.x86_64 0:1.0.9-9.el7 libSM.x86_64 0:1.2.2-2.el7 libXext.x86_64 0:1.3.3-3.el7 libXi.x86_64 0:1.7.9-1.el7
libXinerama.x86_64 0:1.1.3-2.1.el7 libXmu.x86_64 0:1.1.2-2.el7 libXrandr.x86_64 0:1.5.1-2.el7 libXrender.x86_64 0:0.9.10-1.el7
libXt.x86_64 0:1.1.5-3.el7 libXtst.x86_64 0:1.2.3-1.el7 libXv.x86_64 0:1.0.11-1.el7 libXxf86dga.x86_64 0:1.1.4-2.1.el7
libXxf86misc.x86_64 0:1.0.3-7.1.el7 libXxf86vm.x86_64 0:1.1.4-1.el7 libaio-devel.x86_64 0:0.3.109-13.el7 libbasicobjects.x86_64 0:0.1.1-29.el7
libcollection.x86_64 0:0.7.0-29.el7 libdmx.x86_64 0:1.1.3-3.el7 libevent.x86_64 0:2.0.21-4.el7 libini_config.x86_64 0:1.3.1-29.el7
libnfsidmap.x86_64 0:0.25-19.el7 libpath_utils.x86_64 0:0.2.1-29.el7 libref_array.x86_64 0:0.1.5-29.el7 libstdc++-devel.x86_64 0:4.8.5-28.el7_5.1
libverto-libevent.x86_64 0:0.2.5-4.el7 nfs-utils.x86_64 1:1.3.0-0.54.el7 psmisc.x86_64 0:22.20-15.el7 xorg-x11-utils.x86_64 0:7.5-22.el7
xorg-x11-xauth.x86_64 1:1.0.9-1.el7

Note that the limits are stored in limits.d which has priority over limits.conf:

[root@instance-20180803-1152 opc]# cat /etc/security/limits.d/oracle-database-preinstall-18c.conf
 
# oracle-database-preinstall-18c setting for nofile soft limit is 1024
oracle soft nofile 1024
 
# oracle-database-preinstall-18c setting for nofile hard limit is 65536
oracle hard nofile 65536
 
# oracle-database-preinstall-18c setting for nproc soft limit is 16384
# refer orabug15971421 for more info.
oracle soft nproc 16384
 
# oracle-database-preinstall-18c setting for nproc hard limit is 16384
oracle hard nproc 16384
 
# oracle-database-preinstall-18c setting for stack soft limit is 10240KB
oracle soft stack 10240
 
# oracle-database-preinstall-18c setting for stack hard limit is 32768KB
oracle hard stack 32768
 
# oracle-database-preinstall-18c setting for memlock hard limit is maximum of 128GB on x86_64 or 3GB on x86 OR 90 % of RAM
oracle hard memlock 134217728
 
# oracle-database-preinstall-18c setting for memlock soft limit is maximum of 128GB on x86_64 or 3GB on x86 OR 90% of RAM
oracle soft memlock 134217728

Note that memlock is set to 128GB here, but it can be higher on machines with a lot of RAM (up to 90% of RAM).
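As a quick check of the memlock arithmetic, here is a small Python sketch. The rule in memlock_kb (the larger of 128GB and 90% of RAM) is only my reading of the comments in the limits file, not something taken from the RPM scripts, and the function name is hypothetical.

```python
# Sketch only: the rule below is an assumption read from the comments in the
# limits file, not taken from the preinstall RPM's scripts.
GIB_IN_KB = 1024 * 1024  # the limits file expresses memlock in KB


def memlock_kb(ram_kb: int) -> int:
    """Return the larger of 128GB and 90% of RAM, in KB (assumed x86_64 rule)."""
    floor_kb = 128 * GIB_IN_KB
    return max(floor_kb, ram_kb * 9 // 10)


# The value shown in the limits file is exactly 128GB expressed in KB.
print(134217728 // GIB_IN_KB)      # 128
# On a 16GB machine the 128GB value wins.
print(memlock_kb(16 * GIB_IN_KB))  # 134217728
```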

And for information, here is what is set in /etc/sysctl.conf:

fs.file-max = 6815744
kernel.sem = 250 32000 100 128
kernel.shmmni = 4096
kernel.shmall = 1073741824
kernel.shmmax = 4398046511104
kernel.panic_on_oops = 1
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
net.ipv4.conf.all.rp_filter = 2
net.ipv4.conf.default.rp_filter = 2
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500
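Some of these values are related to each other, which is easy to verify. The following Python sketch assumes 4KB pages (the usual size on x86_64 Linux, to be checked with getconf PAGE_SIZE) and only re-does the arithmetic on the values listed above.

```python
# Assumption: 4KB memory pages, the usual size on x86_64 Linux.
PAGE_SIZE = 4096

shmall_pages = 1073741824     # kernel.shmall is expressed in pages
shmmax_bytes = 4398046511104  # kernel.shmmax is expressed in bytes

# kernel.sem = SEMMSL SEMMNS SEMOPM SEMMNI
semmsl, semmns, semopm, semmni = (int(v) for v in "250 32000 100 128".split())

shmall_bytes = shmall_pages * PAGE_SIZE
print(shmall_bytes == shmmax_bytes)  # True: both describe the same 4TB
print(shmall_bytes // 1024**4)       # 4
print(semmsl * semmni == semmns)     # True: 250 semaphores x 128 sets
```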

Besides that, the preinstall RPM disables NUMA and transparent huge pages (as boot options in GRUB). It also creates the oracle user (id 54321, belonging to the groups oinstall, dba, oper, backupdba, dgdba, kmdba and racdba).

 

The article Oracle 18c preinstall RPM on RedHat RHEL first appeared on Blog dbi services.


How much free space can be reclaimed from a segment?


Do you have the feeling that your table takes more blocks than it should? Here are the queries I use to quickly check the free space. The idea is to call DBMS_SPACE.SPACE_USAGE and infer the minimum free space from the percentages. For example, a block in FS3 (defined as having 50 to 75% free space) is supposed to have at least 50% free space. Of course it can have more, but you don’t know how much.

Here is some PL/SQL to do so:

set serveroutput on
declare
 unf number; unfb number; fs1 number; fs1b number; fs2 number; fs2b number;
 fs3 number; fs3b number; fs4 number; fs4b number; full number; fullb number;
begin
 -- the 10 largest ASSM segments of the segment types handled here
 for i in (select * from (select * from dba_segments where segment_subtype='ASSM' and segment_type in (
   'TABLE','TABLE PARTITION','TABLE SUBPARTITION','CLUSTER','LOB','LOB PARTITION','LOB SUBPARTITION'
  ) order by bytes desc) where 10>=rownum)
 loop
  begin
   dbms_space.space_usage(i.owner,i.segment_name,i.segment_type,unf,unfb,fs1,fs1b,fs2,fs2b,fs3,fs3b,fs4,fs4b,full,fullb,partition_name=>i.partition_name);
   -- weighted sum of the bytes reported in each free-space bucket
   dbms_output.put_line(to_char((unfb+fs1b+fs2b*0.25+fs3b*0.5+fs4b*0.75)/1024/1024/1024,'999G999D999')||' GB free in '||i.segment_type||' "'||i.owner||'"."'||i.segment_name||'" partition "'||i.partition_name||'"');
  exception
   -- report the error for this segment and continue with the next one
   when others then dbms_output.put_line(i.segment_type||' "'||i.owner||'"."'||i.segment_name||'" partition "'||i.partition_name||'": '||sqlerrm);
  end;
 end loop;
end;
/

The output looks like:

.001 GB free in INDEX "DEMO"."ACCOUNT_PK" partition ""
.001 GB free in TABLE "APEX_040200"."WWV_FLOW_PAGE_PLUGS" partition ""
.009 GB free in TABLE "SCOTT"."DEMO" partition ""
.000 GB free in TABLE "APEX_040200"."WWV_FLOW_STEP_ITEMS" partition ""
.003 GB free in INDEX "SYS"."WRH$_SYSMETRIC_HISTORY_INDEX" partition ""
.000 GB free in TABLE "MDSYS"."SDO_CS_SRS" partition ""
.002 GB free in INDEX "SYS"."I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST" partition ""
.006 GB free in TABLE "SYS"."WRH$_SYSMETRIC_HISTORY" partition ""
.002 GB free in TABLE "SYS"."WRH$_SQL_PLAN" partition ""
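The weighting behind these numbers can be reproduced outside the database. The Python sketch below mirrors the expression used in the script, with the weights as a parameter, because a strict lower bound would count FS1 (0 to 25% free) as 0 rather than 1. The function name and the sample figures are hypothetical.

```python
# Weights as written in the script's expression:
# unfb + fs1b + fs2b*0.25 + fs3b*0.5 + fs4b*0.75
SCRIPT_WEIGHTS = {"unformatted": 1.0, "fs1": 1.0, "fs2": 0.25, "fs3": 0.5, "fs4": 0.75}
# Strict lower bound: a block in FS1 may have no free space at all.
STRICT_WEIGHTS = dict(SCRIPT_WEIGHTS, fs1=0.0)


def estimated_free_bytes(bucket_bytes, weights=SCRIPT_WEIGHTS):
    """Weighted sum of the bytes that SPACE_USAGE reports per bucket."""
    return sum(bucket_bytes.get(name, 0) * w for name, w in weights.items())


# Hypothetical figures, in bytes, for one segment
MIB = 2**20
sample = {"unformatted": 0, "fs1": 8 * MIB, "fs2": 16 * MIB, "fs3": 4 * MIB, "fs4": 32 * MIB}

print(estimated_free_bytes(sample) / MIB)                  # 38.0
print(estimated_free_bytes(sample, STRICT_WEIGHTS) / MIB)  # 30.0
```

The gap between the two results is the FS1 contribution, which tells how sensitive the estimate is to that choice.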

If you are in 12c, an inline function in the query might come in handy:

with function freebytes(segment_owner varchar2, segment_name varchar2, segment_type varchar2,partition_name varchar2) return number as
 unf number; unfb number; fs1 number; fs1b number; fs2 number; fs2b number;
 fs3 number; fs3b number; fs4 number; fs4b number; full number; fullb number;
begin
 dbms_space.space_usage(segment_owner,segment_name,segment_type,unf,unfb,fs1,fs1b,fs2,fs2b,fs3,fs3b,fs4,fs4b,full,fullb,partition_name=>partition_name);
 -- weighted sum of the bytes reported in each free-space bucket
 return unfb+fs1b+fs2b*0.25+fs3b*0.5+fs4b*0.75;
end;
select round(freebytes(owner,segment_name,segment_type,partition_name)/1024/1024/1024,3) free_GB,segment_type,owner,segment_name,partition_name
from dba_segments where segment_subtype='ASSM' and segment_type in (
 'TABLE','TABLE PARTITION','TABLE SUBPARTITION','CLUSTER','LOB','LOB PARTITION','LOB SUBPARTITION'
) order by bytes desc fetch first 10 rows only
/

The result looks like:

FREE_GB SEGMENT_TYPE OWNER SEGMENT_NAME PARTITION_NAME
------- ------------ ----- ------------ --------------
0 TABLE DEMO ACCOUNTS
0.001 INDEX DEMO ACCOUNT_PK
0.001 TABLE APEX_040200 WWV_FLOW_PAGE_PLUGS
0.009 TABLE SCOTT DEMO
0.003 INDEX SYS WRH$_SYSMETRIC_HISTORY_INDEX
0 TABLE APEX_040200 WWV_FLOW_STEP_ITEMS
0.002 INDEX SYS I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST
0 TABLE MDSYS SDO_CS_SRS
0.006 TABLE SYS WRH$_SYSMETRIC_HISTORY
0.002 TABLE SYS WRH$_SQL_PLAN

Future evolution will be published on GitHub:
https://raw.githubusercontent.com/FranckPachot/scripts/master/administration/segment_free_space_plsql.sql
https://raw.githubusercontent.com/FranckPachot/scripts/master/administration/segment_free_space_sql.sql

Note that having free space does not mean that you have to shrink or reorg. Try to understand what happened to your data before, and whether this space will be reused soon.

Update 8-AUG-2018

In the initial post I included all segment types accepted by the dbms_space documentation, but I finally removed 'INDEX', 'INDEX PARTITION' and 'INDEX SUBPARTITION' because the meaning of the output is completely different for them. See Jonathan Lewis’s note about it: https://jonathanlewis.wordpress.com/2013/12/17/dbms_space_usage/

 


MERGE JOIN CARTESIAN: a join method or a join type?


I’ll be presenting on join methods at POUG and DOAG. I’ll show how the different join methods work in order to better understand them. The idea is to show Nested Loops, Hash Join, Sort Merge Join and Merge Join Cartesian on the same query. I’ll run a simple join between DEPT and EMP with the USE_NL, USE_HASH, USE_MERGE and USE_MERGE_CARTESIAN hints. I’ll show the execution plan, with SQL Monitoring in text mode. And I’ll put some gdb breakpoints on the ‘qer’ (query execution rowsource) functions to run the plan operations step by step. Then I’ll do the same on a different query in order to show the 12c adaptive plans in detail.

But wait, I listed Nested Loops, Hash Join, Sort Merge Join, Merge Join Cartesian… but is Merge Join Cartesian really a join method? I mean, my query is not a cartesian join: I have all the join predicates here. But for sure you can also do an inner join by starting with a cartesian join and then filtering on the join predicate. That is doing physically what the old Oracle join syntax does logically: no predicates in the FROM clause, and the join predicates added to the WHERE clause to filter the result.

If I look at the 12.2 documentation, it is a join method:
[Screenshot: the join methods listed in the 12.2 documentation]

By definition, a join method is how the join will be executed. It is not a decision of the SQL developer because SQL is declarative: you declare the result you want, and the optimizer decides how to get it. And this is why hints are in comments: they are not part of the declarative syntax. Forcing how to do it is not part of SQL.

Just after listing the join methods, the documentation lists the join types, which are part of SQL because they declare the join result you expect. Inner join to get all matching rows. Semi join to get only the first matching row. Anti join to get all rows which do not match. Outer join to get all matching rows in addition to those which do not match. The syntax is INNER JOIN, OUTER JOIN, EXISTS or IN, NOT EXISTS or NOT IN. Join type is not ‘how’ but ‘what’.
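To make the ‘what’ concrete, here is a minimal Python sketch of the four join types on a few DEPT and EMP rows taken from the SCOTT schema. It only declares the expected results with list comprehensions and says nothing about how Oracle executes them. The matches helper is a name of my own.

```python
# A few rows from SCOTT: DEPT(deptno, dname) and EMP(empno, ename, deptno)
dept = [(10, "ACCOUNTING"), (20, "RESEARCH"), (30, "SALES"), (40, "OPERATIONS")]
emp = [(7782, "CLARK", 10), (7839, "KING", 10), (7566, "JONES", 20), (7698, "BLAKE", 30)]


def matches(d, e):
    """Join predicate: DEPT.DEPTNO = EMP.DEPTNO."""
    return d[0] == e[2]


# Inner join: all matching (dept, emp) pairs
inner = [(d, e) for d in dept for e in emp if matches(d, e)]
# Semi join (EXISTS / IN): each dept once, as soon as one emp matches
semi = [d for d in dept if any(matches(d, e) for e in emp)]
# Anti join (NOT EXISTS / NOT IN): depts without any matching emp
anti = [d for d in dept if not any(matches(d, e) for e in emp)]
# Left outer join: the inner join plus the non-matching depts
outer = inner + [(d, None) for d in anti]

print(len(inner), len(semi), len(anti), len(outer))  # 4 3 1 5
```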

Ok, so back to the join method. Let’s force it on my inner join between DEPT and EMP:

11g


SQL> alter session set current_schema=SCOTT statistics_level=all;
Session altered.
 
SQL> select /*+ leading(DEPT) USE_MERGE_CARTESIAN(EMP) FULL(DEPT) */ * from DEPT join EMP using(deptno);
 
DEPTNO DNAME LOC EMPNO ENAME JOB MGR HIREDATE SAL COMM
---------- -------------- ------------- ---------- ---------- --------- ---------- --------- ---------- ----------
10 ACCOUNTING NEW YORK 7782 CLARK MANAGER 7839 09-JUN-81 2450
10 ACCOUNTING NEW YORK 7839 KING PRESIDENT 17-NOV-81 5000
10 ACCOUNTING NEW YORK 7934 MILLER CLERK 7782 23-JAN-82 1300
10 RESEARCH DALLAS 7782 CLARK MANAGER 7839 09-JUN-81 2450
10 RESEARCH DALLAS 7839 KING PRESIDENT 17-NOV-81 5000
10 RESEARCH DALLAS 7934 MILLER CLERK 7782 23-JAN-82 1300
10 SALES CHICAGO 7782 CLARK MANAGER 7839 09-JUN-81 2450
10 SALES CHICAGO 7839 KING PRESIDENT 17-NOV-81 5000
10 SALES CHICAGO 7934 MILLER CLERK 7782 23-JAN-82 1300
10 OPERATIONS BOSTON 7782 CLARK MANAGER 7839 09-JUN-81 2450
10 OPERATIONS BOSTON 7839 KING PRESIDENT 17-NOV-81 5000
10 OPERATIONS BOSTON 7934 MILLER CLERK 7782 23-JAN-82 1300
 
12 rows selected.
 
SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------
SQL_ID 1xpfxq6pc30vq, child number 0
-------------------------------------
select /*+ leading(DEPT) USE_MERGE_CARTESIAN(EMP) FULL(DEPT) */ * from
DEPT join EMP using(deptno)
 
Plan hash value: 2034389985
 
------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 12 |00:00:00.01 | 7 | | | |
| 1 | MERGE JOIN CARTESIAN| | 1 | 14 | 12 |00:00:00.01 | 7 | | | |
| 2 | TABLE ACCESS FULL | DEPT | 1 | 4 | 4 |00:00:00.01 | 4 | | | |
| 3 | BUFFER SORT | | 4 | 4 | 12 |00:00:00.01 | 3 | 2048 | 2048 | 2048 (0)|
|* 4 | TABLE ACCESS FULL | EMP | 1 | 4 | 3 |00:00:00.01 | 3 | | | |
------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
4 - filter("DEPT"."DEPTNO"="EMP"."DEPTNO")
 

OK, so I declared my result with an inner join query, and I forced the join method with a hint to show that it is possible. But look at the result. 12 rows? Only DEPTNO 10, when the SCOTT schema has employees in 10, 20 and 30? And only 3 employees here, repeated 4 times, once for each department name? That’s a wrong result.

NEVER FORCE A CARTESIAN JOIN WITH USE_MERGE_CARTESIAN!

That’s a very old bug: Bug 17064391 Wrong result with USE_MERGE_CARTESIAN hint, finally fixed in 12c (12.2, and backported in a 12.1 PSU).

Then how is it fixed?

18c

With the fix, the hint is just ignored and a SORT MERGE JOIN is used here:

SQL> alter session set current_schema=SCOTT statistics_level=all;
Session altered.
 
SQL> select /*+ leading(DEPT) USE_MERGE_CARTESIAN(EMP) FULL(DEPT) */ * from DEPT join EMP using(deptno);
 
DEPTNO DNAME LOC EMPNO ENAME JOB MGR HIREDATE SAL COMM
---------- -------------- ------------- ---------- ---------- --------- ---------- --------- ---------- ----------
10 ACCOUNTING NEW YORK 7782 CLARK MANAGER 7839 09-JUN-81 2450
10 ACCOUNTING NEW YORK 7839 KING PRESIDENT 17-NOV-81 5000
10 ACCOUNTING NEW YORK 7934 MILLER CLERK 7782 23-JAN-82 1300
20 RESEARCH DALLAS 7566 JONES MANAGER 7839 02-APR-81 2975
20 RESEARCH DALLAS 7902 FORD ANALYST 7566 03-DEC-81 3000
20 RESEARCH DALLAS 7876 ADAMS CLERK 7788 23-MAY-87 1100
20 RESEARCH DALLAS 7369 SMITH CLERK 7902 17-DEC-80 800
20 RESEARCH DALLAS 7788 SCOTT ANALYST 7566 19-APR-87 3000
30 SALES CHICAGO 7521 WARD SALESMAN 7698 22-FEB-81 1250 500
30 SALES CHICAGO 7844 TURNER SALESMAN 7698 08-SEP-81 1500 0
30 SALES CHICAGO 7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300
30 SALES CHICAGO 7900 JAMES CLERK 7698 03-DEC-81 950
30 SALES CHICAGO 7698 BLAKE MANAGER 7839 01-MAY-81 2850
30 SALES CHICAGO 7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400
 
14 rows selected.
 
SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------
SQL_ID 1xpfxq6pc30vq, child number 0
-------------------------------------
select /*+ leading(DEPT) USE_MERGE_CARTESIAN(EMP) FULL(DEPT) */ * from
DEPT join EMP using(deptno)
 
Plan hash value: 1407029907
 
--------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 14 |00:00:00.01 | 12 | 12 | | | |
| 1 | MERGE JOIN | | 1 | 14 | 14 |00:00:00.01 | 12 | 12 | | | |
| 2 | SORT JOIN | | 1 | 4 | 4 |00:00:00.01 | 6 | 6 | 2048 | 2048 | 2048 (0)|
| 3 | TABLE ACCESS FULL| DEPT | 1 | 4 | 4 |00:00:00.01 | 6 | 6 | | | |
|* 4 | SORT JOIN | | 4 | 14 | 14 |00:00:00.01 | 6 | 6 | 2048 | 2048 | 2048 (0)|
| 5 | TABLE ACCESS FULL| EMP | 1 | 14 | 14 |00:00:00.01 | 6 | 6 | | | |
--------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
4 - access("DEPT"."DEPTNO"="EMP"."DEPTNO")
filter("DEPT"."DEPTNO"="EMP"."DEPTNO")

So here the result is good, thanks to the fix, and we clearly see how it is fixed: the USE_MERGE_CARTESIAN hint has been ignored.

And the funny thing is that when you look at the 18c documentation, the Merge Join Cartesian is not a join method anymore but a join type:
[Screenshot: the join types listed in the 18c documentation]

Exactly the same paragraph, but now in join types (the ‘what’) rather than in join methods (the ‘how’).

What or How?

Actually, in my opinion, it is both. When you explicitly want a cartesian join, that’s a join type described by the CROSS JOIN in the ANSI join syntax, or the lack of related predicates in the old syntax. This is ‘what’. But you may also encounter a MERGE JOIN CARTESIAN for a non-cartesian join just because the optimizer decides it is more efficient. When you have very few rows on both sides, it may be faster to start with a cartesian product on small rowsources. This can be part of star transformation where fact rows are joined back to the cartesian product of filtered dimensions in order to project the dimension attributes. This is ‘how’ it will be executed. We also see it when the optimizer underestimates the cardinalities and is followed by a long nested loop.

When?

So, let’s look at the documentation section “When the Optimizer Considers Cartesian Joins”:
[Screenshot: “When the Optimizer Considers Cartesian Joins” in the 18c documentation]

  • No join condition exists: that’s when cartesian product is what we want
  • The ORDERED hint specifies a table before its join table is specified: that’s when it is the only join method possible with the specified join order
  • A Cartesian join is an efficient method: then it is a method there, even if documented in join types.

In conclusion, cartesian join is a join type. It can also be used as a join method when the optimizer decides to. But since 12c you cannot decide it yourself by hinting, and trying to do so in previous versions is a very bad idea that can return wrong results.

So, for this one I’ll explicitly run a CROSS JOIN:
[Screenshot: the query, the SQL Monitor report, and the gdb backtrace of the ‘qer’ functions]

The query is on top. The SQL Monitor report is in the middle, showing that we are currently reading rows from EMP. The bottom shows the backtrace of the ‘qer’ functions: the fetch call is propagated from opifch2 for the SELECT STATEMENT, through the MERGE JOIN CARTESIAN (qerjo), the BUFFER SORT (qerso), to the TABLE ACCESS (qertb).

So basically, the goal of this full-demo presentation is to show how to read the execution plan by understanding how it is executed. This qertbFetch on the inner table EMP is executed only for the first row coming from the outer table DEPT. As the rows are returned to a buffer, the further iterations fetch only from this buffer and do not go further than qersoFetchSimple. The qersoProcessULS (‘process underlying row source’, see Frits Hoogland’s annotations) is run only once. This is the big difference with Nested Loops, where the inner loop on the underlying rowsource is run for each outer loop iteration: those two loops are nested, thus the name. But the function for the join part is the same for Nested Loops, Sort Merge Join and Merge Join Cartesian: qerjo. Only the underlying operations differentiate the join methods.
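The difference in how often the inner rowsource is read can be illustrated with a toy model in Python. This is only a sketch of the idea (a shared join loop, with or without a buffer under it), not Oracle’s implementation. CountingSource, nested_loops and buffered_cartesian are names of my own.

```python
class CountingSource:
    """A toy rowsource that counts how many times it is scanned."""

    def __init__(self, rows):
        self.rows = rows
        self.scans = 0

    def scan(self):
        self.scans += 1
        yield from self.rows


def nested_loops(outer, inner, predicate):
    for o in outer.scan():
        for i in inner.scan():  # inner rowsource restarted for every outer row
            if predicate(o, i):
                yield (o, i)


def buffered_cartesian(outer, inner, predicate):
    buffer = None
    for o in outer.scan():
        if buffer is None:
            buffer = list(inner.scan())  # read once, on the first outer row
        for i in buffer:  # later iterations read only the buffer
            if predicate(o, i):
                yield (o, i)


DEPT = [(10, "ACCOUNTING"), (20, "RESEARCH"), (30, "SALES"), (40, "OPERATIONS")]
EMP = [(7782, "CLARK", 10), (7839, "KING", 10), (7566, "JONES", 20), (7698, "BLAKE", 30)]


def same_deptno(d, e):
    return d[0] == e[2]


dept_nl, emp_nl = CountingSource(DEPT), CountingSource(EMP)
rows_nl = list(nested_loops(dept_nl, emp_nl, same_deptno))

dept_bc, emp_bc = CountingSource(DEPT), CountingSource(EMP)
rows_bc = list(buffered_cartesian(dept_bc, emp_bc, same_deptno))

print(rows_nl == rows_bc)          # True: same result
print(emp_nl.scans, emp_bc.scans)  # 4 1: EMP scanned per outer row vs. once
```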

One last comment: we don’t see any function which really sorts the rows in this buffer (as we will see for the Sort Merge Join method) because there is no sorting, despite the name of the BUFFER SORT operation. More info on Jonathan Lewis’s blog.

 


ATP vs ADW – the Autonomous Database lockdown profiles


The Oracle database has always distinguished two types of workloads: transactional (OLTP) and datawarehouse (VLDB, DWH, DSS, BI, analytics). There is the same idea in the managed Oracle Cloud with two autonomous database services.

To show how old this is, here is how they were defined in the Oracle7 Tuning Book:

[Screenshot: OLTP and DSS definitions from the Oracle7 Tuning Book]

The definition has not changed a lot, but the technology behind DSS/DWH has improved. Now, with In-Memory Column Store, Smart Scan and Result Cache, we can even see that indexes, materialized views, star transformation, hints, etc. are disabled in the Autonomous Datawarehouse cloud service.

The differences between the two autonomous cloud services, ATP (Autonomous Transaction Processing) for OLTP and ADW (Autonomous Datawarehouse) for analytics, have been described by Maria Colgan after Larry Ellison’s announcement:
https://sqlmaria.com/2018/08/07/how-does-autonomous-transaction-processing-differ-from-the-autonomous-data-warehouse/

PDBaaS

Those autonomous services are PDB as a Service. They are using the consolidation and isolation features of 12cR2 multitenant. And we can even see that the ATP and ADW services can run within the same CDB and same instance. They are different PDBs and they differ only by their resource management plans and PDB lockdown profiles. So let’s see the differences from this point of view: ATP lockdown profile is called ‘OLTP’ and ADW lockdown profile is called ‘DWCS’.

Options

The only difference is about partitioning, which is enabled for ATP and disabled for ADW:

SQL> select (select value from v$parameter where name='pdb_lockdown') type,v$lockdown_rules.* from v$lockdown_rules where rule_type='OPTION';
 
TYPE RULE_TYPE RULE CLAUSE CLAUSE_OPTION STATUS USERS CON_ID
---- --------- ---- ------ ------------- ------ ----- ------
OLTP OPTION PARTITIONING ENABLE ALL 284
 
SQL> select (select value from v$parameter where name='pdb_lockdown') type,v$lockdown_rules.* from v$lockdown_rules where rule_type='OPTION' order by 1, 2 ,3 nulls first, 4 nulls first,5;
 
TYPE RULE_TYPE RULE CLAUSE CLAUSE_OPTION STATUS USERS CON_ID
---- --------- ---- ------ ------------- ------- ----- ------
DWCS OPTION PARTITIONING DISABLE ALL 73
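When the rule lists get longer, comparing the two profiles by eye becomes tedious. Here is a small Python sketch of such a comparison, fed with the two OPTION rows shown above. The dictionary layout and the diff_rules helper are my own choices, and the rows would have to be exported from v$lockdown_rules in each PDB.

```python
# Rows copied from the two query outputs above:
# (rule_type, rule) -> (status, users)
oltp = {("OPTION", "PARTITIONING"): ("ENABLE", "ALL")}
dwcs = {("OPTION", "PARTITIONING"): ("DISABLE", "ALL")}


def diff_rules(a, b):
    """Return the rules that differ between two profiles, or exist in only one."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


for rule, (in_oltp, in_dwcs) in diff_rules(oltp, dwcs).items():
    print(rule, "OLTP:", in_oltp, "DWCS:", in_dwcs)
```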

Features

All disabled features are the same:
Disabled for all users: COMMON_SCHEMA_ACCESS, CONNECTIONS, CTX_LOGGING, NETWORK_ACCESS, OS_ACCESS, SYSTEM_DATA
Disabled for local users only: USER_PASSWORD_VERIFIERS
Enabled for all users: COMMON_USER_CONNECT, TRACE_VIEW_ACCESS, UTL_FILE

Those last ones are good news. We can query V$DIAG_OPT_TRACE_RECORDS, V$DIAG_SQL_TRACE_RECORDS and V$DIAG_TRACE_FILE_CONTENTS to show some diagnostics. Unfortunately, I’ve seen no allowed way to enable SQL Trace.

DDL Statements

In the ADW service, local users cannot create indexes and materialized views. They are supposed to be autonomously created when required (probably by a common user):

SQL> select (select value from v$parameter where name='pdb_lockdown') type,v$lockdown_rules.* from v$lockdown_rules where regexp_like(rule,'(INDEX|MATERIALIZED)') order by 1, 2 ,3 nulls first, 4 nulls first,5;
 
TYPE RULE_TYPE RULE CLAUSE CLAUSE_OPTION STATUS USERS CON_ID
---- --------- ---- ------ ------------- ------ ----- ------
DWCS STATEMENT ALTER INDEX DISABLE LOCAL 73
DWCS STATEMENT ALTER INDEXTYPE DISABLE LOCAL 73
DWCS STATEMENT ALTER MATERIALIZED VIEW DISABLE LOCAL 73
DWCS STATEMENT ALTER MATERIALIZED VIEW LOG DISABLE LOCAL 73
DWCS STATEMENT CREATE INDEX DISABLE LOCAL 73
DWCS STATEMENT CREATE INDEXTYPE DISABLE LOCAL 73
DWCS STATEMENT CREATE MATERIALIZED VIEW DISABLE LOCAL 73
DWCS STATEMENT CREATE MATERIALIZED VIEW LOG DISABLE LOCAL 73
DWCS STATEMENT DROP INDEX DISABLE LOCAL 73
DWCS STATEMENT DROP INDEXTYPE DISABLE LOCAL 73
DWCS STATEMENT DROP MATERIALIZED VIEW DISABLE LOCAL 73
DWCS STATEMENT DROP MATERIALIZED VIEW LOG DISABLE LOCAL 73

The ATP service enables all those for all users.

System parameters

Both services disable all ALTER SYSTEM statements except KILL SESSION. They allow all ALTER SYSTEM SET for the common users but allow only a few parameters to be changed by local users.

Basically, the ADW service sets the following:


_default_pct_free=1
_ldr_io_size=33554432
_ldr_io_size2=33554432
_max_io_size=33554432
_optimizer_allow_all_access_paths=FALSE
_optimizer_answering_query_using_stats=TRUE
optimizer_ignore_hints=TRUE
optimizer_ignore_parallel_hints=TRUE
parallel_degree_policy=AUTO
parallel_min_degree=CPU
result_cache_max_result=1
result_cache_mode=FORCE

The ATP service keeps the defaults.

Basically, all the features for OLTP have been in the Oracle Database for years. Under the ‘autonomous’ umbrella we can see some well-known features:

  • Automatic provisioning: pluggable databases
  • Automatic scaling: PDB resource manager
  • Automatic tuning: SQL Plan Management, Adaptive Plans, SQL Tuning Advisor, Storage Indexes
  • Automatic security: Rolling patches, Encryption, Database Vault
  • Automatic Fault Tolerant Failover: RAC, Data Guard
  • Automatic Backup and Recovery: RMAN, Flashback

All these proven features are enhanced to work together in a consolidated multitenant environment. Provisioning is as fast as a CREATE PDB. The ATP and ADW services enable and configure the right set of features for each workload.

 


TRANSPORT_CONNECT_TIMEOUT and RETRY_COUNT


When you have a Data Guard configuration, you want the application to connect to the right server, where the primary is, without taking too much time. The default TCP timeout is 1 minute, which is too long. When you don’t want to configure a virtual IP address (VIP), you can simply list all the addresses in the client connection string. But then you need to reduce the timeout. A short timeout of 1 to 5 seconds will be OK most of the time, but in case of network issues you want to give it a chance to retry with a longer timeout. This post is about the connection string parameters that define this. Of course, it is all documented, but the goal of this post is also to show how to test it quickly, because a reliable understanding of how it works relies on both documentation and testing.

Here is a simple client failover configuration where the connection tries 10.10.10.10 and, if that fails, tries 10.10.10.11:

DEFAULT=
(DESCRIPTION=
(CONNECT_DATA=(SERVICE_NAME=pdb1))
(ADDRESS_LIST=
(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.10)(PORT=1521))
(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.11)(PORT=1521))
)
)

The problem is that when 10.10.10.10 is down, 10.10.10.11 will be tried only after 60 seconds, the default TCP timeout. You can completely avoid waiting for the timeout by using a virtual IP that will always be up, started on the failed-over server. But you can also reduce the TCP timeout to a few seconds.

Here is a tnsping with the above tnsnames.ora entry when both servers are down:


$ time tnsping DEFAULT
 
TNS Ping Utility for Linux: Version 18.0.0.0.0 - Production on 10-AUG-2018 15:15:55
 
Copyright (c) 1997, 2018, Oracle. All rights reserved.
 
Used parameter files:
 
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION= (CONNECT_DATA=(SERVICE_NAME=pdb1)) (ADDRESS_LIST= (ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.10)(PORT=1521)) (ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.11)(PORT=1521))))
TNS-12535: TNS:operation timed out
 
real 2m0.051s
user 0m0.005s
sys 0m0.011s

That’s 2 minutes because there is a 1 minute timeout for each address.

TRANSPORT_CONNECT_TIMEOUT

Now, just adding TRANSPORT_CONNECT_TIMEOUT to the connection string description reduces the timeout to 4 seconds:


DESCRIPTION=
(DESCRIPTION=
(CONNECT_DATA=(SERVICE_NAME=pdb1))
(TRANSPORT_CONNECT_TIMEOUT=4)
(ADDRESS_LIST=
(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.10)(PORT=1521))
(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.11)(PORT=1521))
)
)

The total time to get the answer from both addresses is 8 seconds, 4 seconds for each:

$ time tnsping DESCRIPTION
 
TNS Ping Utility for Linux: Version 18.0.0.0.0 - Production on 10-AUG-2018 15:15:55
 
Copyright (c) 1997, 2018, Oracle. All rights reserved.
 
Used parameter files:
 
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION= (CONNECT_DATA=(SERVICE_NAME=pdb1)) (TRANSPORT_CONNECT_TIMEOUT=4) (ADDRESS_LIST= (ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.10)(PORT=1521)) (ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.11)(PORT=1521))))
TNS-12535: TNS:operation timed out
 
real 0m8.023s
user 0m0.010s
sys 0m0.006s

RETRY_COUNT

If you lower the timeout, you may want to give a chance to retry a few times with RETRY_COUNT. Here, RETRY_COUNT=2 gives 3 attempts (1 + 2 retries) on the address list:

$ time tnsping RETRY_COUNT
 
TNS Ping Utility for Linux: Version 18.0.0.0.0 - Production on 10-AUG-2018 15:49:34
 
Copyright (c) 1997, 2018, Oracle. All rights reserved.
 
Used parameter files:
 
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION= (CONNECT_DATA=(SERVICE_NAME=pdb1)) (TRANSPORT_CONNECT_TIMEOUT=4) (RETRY_COUNT=2) (ADDRESS_LIST= (ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.10)(PORT=1521)) (ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.11)(PORT=1521))))
TNS-12535: TNS:operation timed out
 
real 0m24.049s
user 0m0.011s
sys 0m0.010s

This has tried 10.10.10.10 and then 10.10.10.11 for 4 seconds each, and then retried the list 2 more times, which in total takes 3x2x4=24 seconds.
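The arithmetic behind the timings measured so far fits in a one-line function. Here is a Python sketch, where worst_case_seconds is a name of my own and the model simply assumes that every address times out:

```python
def worst_case_seconds(timeout, addresses, retry_count=0):
    """Time spent when every address times out: attempts x addresses x timeout."""
    return (1 + retry_count) * addresses * timeout


print(worst_case_seconds(60, 2))     # 120: default timeout, the 2 minutes seen first
print(worst_case_seconds(4, 2))      # 8:   TRANSPORT_CONNECT_TIMEOUT=4
print(worst_case_seconds(4, 2, 2))   # 24:  plus RETRY_COUNT=2
```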

DESCRIPTION_LIST

TRANSPORT_CONNECT_TIMEOUT and RETRY_COUNT apply only within their DESCRIPTION. You may want to make several attempts with an increasing timeout. For example: try each address for one second to get a quick connection to the primary, wherever it is, when the network is in good health. Then give two attempts with a 5-second timeout for bad network times. And then one final attempt on each address with the default timeout to be sure that the servers are down.

You can use a DESCRIPTION_LIST for this:

INCREASING=
(DESCRIPTION_LIST=
(LOAD_BALANCE=off)
(DESCRIPTION=
(CONNECT_DATA=(SERVICE_NAME=pdb1))
(TRANSPORT_CONNECT_TIMEOUT=1)
(ADDRESS_LIST=
(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.10)(PORT=1521))
(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.11)(PORT=1521))
)
)
(DESCRIPTION=
(CONNECT_DATA=(SERVICE_NAME=pdb1))
(TRANSPORT_CONNECT_TIMEOUT=5)
(RETRY_COUNT=1)
(ADDRESS_LIST=
(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.10)(PORT=1521))
(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.11)(PORT=1521))
)
)
(DESCRIPTION=
(CONNECT_DATA=(SERVICE_NAME=pdb1))
(TRANSPORT_CONNECT_TIMEOUT=2)
(ADDRESS_LIST=
(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.10)(PORT=1521))
(ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.11)(PORT=1521))
)
)
)
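To see what this entry means in terms of timing when nothing answers, here is a Python sketch that lists the expected attempts in order. It assumes that the descriptions and their addresses are tried sequentially (LOAD_BALANCE=off) and that each attempt lasts exactly its timeout. The stage values are read from the entry as written above, and attempt_schedule is a name of my own.

```python
ADDRESSES = ["10.10.10.10", "10.10.10.11"]
# One (TRANSPORT_CONNECT_TIMEOUT, RETRY_COUNT) pair per DESCRIPTION, in order
STAGES = [(1, 0), (5, 1), (2, 0)]


def attempt_schedule(stages, addresses):
    """Return ([(start_offset_s, host, timeout_s), ...], total_seconds)."""
    elapsed = 0
    schedule = []
    for timeout, retry_count in stages:
        for _ in range(1 + retry_count):  # first try plus the retries
            for host in addresses:
                schedule.append((elapsed, host, timeout))
                elapsed += timeout
    return schedule, elapsed


schedule, total = attempt_schedule(STAGES, ADDRESSES)
for start, host, timeout in schedule:
    print(f"+{start:>2}s connect {host} (timeout {timeout}s)")
print("total:", total, "seconds")  # total: 26 seconds
```

The start offsets (0, 1, 2, 7, 12, 17, 22, 24) are what a trace of the connect calls should show if the model is right.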

Rather than just timing the total of the attempts, I’ll strace each connection:

$ strace -tT tnsping INCREASING 2>&1 | grep -C1 --color=auto -E 'poll.*|inet_addr[()".0-9]*'
 
16:15:49 fcntl(4, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000008>
16:15:49 connect(4, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("10.10.10.10")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000087>
16:15:49 times(NULL) = 434920117 <0.000011>
16:15:49 mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efce31bc000 <0.000013>
16:15:49 poll([{fd=4, events=POLLOUT}], 1, 1000) = 0 (Timeout) <1.001435>
16:15:50 close(4) = 0 <0.000256>
--
16:15:50 fcntl(4, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000060>
16:15:50 connect(4, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("10.10.10.11")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000495>
16:15:50 times(NULL) = 434920218 <0.000062>
16:15:50 poll([{fd=4, events=POLLOUT}], 1, 1000) = 0 (Timeout) <1.000768>
16:15:51 close(4) = 0 <0.000050>
--
16:15:51 fcntl(4, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000015>
16:15:51 connect(4, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("10.10.10.10")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000060>
16:15:51 times(NULL) = 434920318 <0.000010>
16:15:51 poll([{fd=4, events=POLLOUT}], 1, 5000) = 0 (Timeout) <5.005563>
16:15:56 close(4) = 0 <0.000027>
--
16:15:56 fcntl(4, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000012>
16:15:56 connect(4, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("10.10.10.11")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000081>
16:15:56 times(NULL) = 434920819 <0.000015>
16:15:56 poll([{fd=4, events=POLLOUT}], 1, 5000) = 0 (Timeout) <5.006265>
16:16:01 close(4) = 0 <0.000192>
--
16:16:01 fcntl(4, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000079>
16:16:01 connect(4, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("10.10.10.10")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000486>
16:16:01 times(NULL) = 434921320 <0.000087>
16:16:01 poll([{fd=4, events=POLLOUT}], 1, 5000) = 0 (Timeout) <5.004660>
16:16:06 close(4) = 0 <0.000611>
--
16:16:06 fcntl(4, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000114>
16:16:06 connect(4, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("10.10.10.11")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000536>
16:16:06 times(NULL) = 434921822 <0.000097>
16:16:06 poll([{fd=4, events=POLLOUT}], 1, 5000) = 0 (Timeout) <5.008128>
16:16:11 close(4) = 0 <0.000135>
--
16:16:11 fcntl(4, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000137>
16:16:11 connect(4, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("10.10.10.10")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000584>
16:16:11 times(NULL) = 434922323 <0.000079>
16:16:11 poll([{fd=4, events=POLLOUT}], 1, 60000) = 0 (Timeout) <60.053782>
16:17:11 close(4) = 0 <0.000166>
--
16:17:11 fcntl(4, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000195>
16:17:11 connect(4, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("10.10.10.11")}, 16) = -1 EINPROGRESS (Operation now in progress) <0.000549>
16:17:11 times(NULL) = 434928329 <0.000488>
16:17:11 poll([{fd=4, events=POLLOUT}], 1, 60000) = 0 (Timeout) <60.007246>
16:18:11 close(4) = 0 <0.000043>

With ‘-T’, strace shows the duration of each system call in angle brackets after the return code. You can see here one attempt to each address with a 1-second timeout, then two attempts to each with a 5-second timeout, and then one attempt to each with the default 60-second timeout.
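To check the timeouts without reading the trace line by line, the output can be summarized with awk. This is a minimal sketch, assuming the `strace -tT` line format shown above and that the only poll() calls are the connection attempts; the function name `summarize_polls` is hypothetical.

```shell
# Hypothetical helper: read `strace -tT` output on stdin and print, for each
# poll() call, the address of the most recent connect() and the timeout (ms)
# passed to poll(), then the sum of those timeouts.
summarize_polls() {
  awk '
    /inet_addr[(]"/ {
      # remember the address of the most recent connect()
      match($0, /inet_addr[(]"[0-9.]+"/)
      addr = substr($0, RSTART + 11, RLENGTH - 12)
    }
    /poll[(]/ && match($0, /, [0-9]+[)] =/) {
      # the third poll() argument is the timeout in milliseconds
      ms = substr($0, RSTART + 2, RLENGTH - 5)
      printf "%s %d\n", addr, ms
      total += ms
    }
    END { printf "total %d\n", total }
  '
}
```

It would be used as `strace -tT tnsping INCREASING 2>&1 | summarize_polls`. For the trace above, the poll() timeouts add up to 142000 ms (2x1 + 4x5 + 2x60 seconds), which matches the elapsed time between 16:15:49 and 16:18:11.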

Note that I have added (LOAD_BALANCE=OFF) here because the default is ON in a DESCRIPTION_LIST but here I want to take them in the order I specified them.

 


18c runInstaller -silent


You find two different ‘runInstaller’ under an Oracle Home: the old one, the Oracle Universal Installer, in $ORACLE_HOME/oui/bin, and the new one, directly in $ORACLE_HOME. They have the same name but are completely different. The old one was used to install an Oracle Home from the installation media. In 18c you don’t use it: Oracle has used it to build the Oracle Home image, which you download and unzip directly as your Oracle Home. You only have to configure it and re-link the binaries, and this is done by the new runInstaller at the root of the Oracle Home. Actually, it is just a shell script that runs the Perl dbSetup.pl to set up the Oracle Database software. In my opinion, it would have been better to call it dbSetup.sh rather than runInstaller, especially given that the equivalent for Grid Infrastructure has been called gridSetup.sh since 12cR2. The Perl script finally runs the Java GUI. It can also be run from the command line, aka silent mode, which is the goal of this post. The command line arguments are similar to, but not the same as, those of the old runInstaller.

Prerequisites

You may want to run only the prerequisite checks, to see if your system is ready for the installation. Here is how to do so from the command line:

$ $ORACLE_HOME/runInstaller -silent -executePrereqs -responseFile $ORACLE_HOME/inventory/response/db_install.rsp
 
Launching Oracle Database Setup Wizard...
 
[FATAL] [INS-13013] Target environment does not meet some mandatory requirements.
CAUSE: Some of the mandatory prerequisites are not met. See logs for details. /u00/app/oraInventory/logs/InstallActions2018-08-11_06-07-14PM/installActions2018-08-11_06-07-14PM.log
ACTION: Identify the list of failed prerequisite checks from the log: /u00/app/oraInventory/logs/InstallActions2018-08-11_06-07-14PM/installActions2018-08-11_06-07-14PM.log. Then either from the log file or from installation manual find the appropriate configuration to meet the prerequisites and fix it manually.

From there we can check the log about the tests that have failed, such as in the following example:

INFO: [Aug 11, 2018 6:08:21 PM] Physical Memory: This is a prerequisite condition to test whether the system has at least 8GB (8388608.0KB) of total physical memory.
INFO: [Aug 11, 2018 6:08:21 PM] Severity:IGNORABLE
INFO: [Aug 11, 2018 6:08:21 PM] OverallStatus:VERIFICATION_FAILED
INFO: [Aug 11, 2018 6:08:21 PM] *********************************************
INFO: [Aug 11, 2018 6:08:21 PM] Run Level: This is a prerequisite condition to test whether the system is running with proper run level.
INFO: [Aug 11, 2018 6:08:21 PM] Severity:CRITICAL
INFO: [Aug 11, 2018 6:08:21 PM] OverallStatus:VERIFICATION_FAILED
INFO: [Aug 11, 2018 6:08:21 PM] *********************************************
INFO: [Aug 11, 2018 6:08:21 PM] OS Kernel Version: This is a prerequisite condition to test whether the system kernel version is at least "2.6.39-400.211.1".
INFO: [Aug 11, 2018 6:08:21 PM] Severity:CRITICAL
INFO: [Aug 11, 2018 6:08:21 PM] OverallStatus:VERIFICATION_FAILED
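The failed checks can be pulled out of such a log with a few lines of awk. This is a minimal sketch that assumes the log keeps the three-line layout of the excerpt above (check description, Severity, OverallStatus); the function name `failed_prereqs` is made up for the example.

```shell
# Hypothetical helper: read an installActions log on stdin and print
# "SEVERITY: check name" for every check whose status is VERIFICATION_FAILED.
failed_prereqs() {
  awk '
    # drop the "INFO: [timestamp] " prefix
    { sub(/^INFO: \[[^]]*\] /, "") }
    # the check name is what precedes ": This is a prerequisite condition"
    /: This is a prerequisite condition/ {
      name = $0
      sub(/: This is a prerequisite condition.*/, "", name)
    }
    /^Severity:/ { sev = substr($0, 10) }
    /^OverallStatus:VERIFICATION_FAILED/ { print sev ": " name }
  '
}
```

For example: `failed_prereqs < installActions2018-08-11_06-07-14PM.log | sort`. Sorting puts the CRITICAL checks, which are the ones to fix first, at the top.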

Software Install

You can pass all parameters on the command line (‘runInstaller -silent -help’ shows all possibilities), but in all cases you need a response file, so I put everything I need in the response file. There’s no mention of the ORACLE_HOME because you have already unzipped it in the right place. The most important parameter is the edition, which seems to accept [EE, SEONE, SE2, HP, XP, PE]. I didn’t try it, but Standard Edition One is for versions <= 12.1.0.1 by the way.

cd $ORACLE_HOME
 
cat > db18EE.rsp <<END
oracle.install.responseFileVersion=/oracle/install/rspfmt_dbinstall_response_schema_v18.0.0
oracle.install.option=INSTALL_DB_SWONLY
UNIX_GROUP_NAME=oinstall
INVENTORY_LOCATION=/u00/app/oraInventory
ORACLE_BASE=/u00/app/oracle
oracle.install.db.InstallEdition=EE
oracle.install.db.OSDBA_GROUP=dba
oracle.install.db.OSOPER_GROUP=oper
oracle.install.db.OSBACKUPDBA_GROUP=backupdba
oracle.install.db.OSDGDBA_GROUP=dgdba
oracle.install.db.OSKMDBA_GROUP=kmdba
oracle.install.db.OSRACDBA_GROUP=dba
END

There is no need for the oracle.install.db.config variables because I’ll install the software only without creating a database.
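Since a typo in the response file only shows up once the installer runs, a quick sanity check beforehand may help. The sketch below is an assumption-laden illustration: `rsp_value` and `check_edition` are hypothetical helper names, and the list of editions is simply the one quoted above, not an authoritative one.

```shell
# Hypothetical helper: print the value of one key from a response file.
#   $1 = key, $2 = response file
rsp_value() {
  # keep what follows "key=" (the dots in the key match any character in
  # sed, which is good enough for this check)
  sed -n "s/^$1=//p" "$2"
}

# Hypothetical check: is the edition one of the values listed above?
#   $1 = response file
check_edition() {
  case "$(rsp_value oracle.install.db.InstallEdition "$1")" in
    EE|SEONE|SE2|HP|XP|PE) return 0 ;;
    *) echo "unexpected or missing InstallEdition in $1" >&2
       return 1 ;;
  esac
}
```

It could be chained in front of the installer, for example `check_edition ./db18EE.rsp && ./runInstaller -silent ...`.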

Here is how to run the dbSetup. You can use ‘-ignorePrereqFailure’ to ignore the prerequisites if you want to install to a host where some prerequisites fail:

./runInstaller -silent -noconfig -ignorePrereqFailure -responseFile ./db18EE.rsp

 
The log of the installation goes into oraInventory/logs and, as usual, you have to run root.sh:

As a root user, execute the following script(s):
1. /u00/app/oracle/product/18SE/root.sh
 
Execute /u00/app/oracle/product/18SE/root.sh on the following nodes:
[vmreforatun01]

This new runInstaller can also apply one-off patches with -applyOneOffs, mentioning the patch locations. You can also build a customized Oracle Home image with -createGoldImage -destinationLocation, and even mention some files or paths to exclude to make it smaller: -exclFiles

 


ORACLE_HOME with symbolic link and postupgrade_fixups


Here is a quick post you may google into if you got the following error when running postupgrade_fixups.sql after an upgrade:

ERROR - Cannot open the preupgrade_messages.properties file from the directory object preupgrade_dir
DECLARE
*
ERROR at line 1:
ORA-29283: invalid file operation
ORA-06512: at "SYS.DBMS_PREUP", line 3300
ORA-06512: at "SYS.UTL_FILE", line 536
ORA-29283: invalid file operation
ORA-06512: at "SYS.UTL_FILE", line 41
ORA-06512: at "SYS.UTL_FILE", line 478
ORA-06512: at "SYS.DBMS_PREUP", line 3260
ORA-06512: at "SYS.DBMS_PREUP", line 9739
ORA-06512: at line 11


Before upgrading a database with dbupgrade, you run, on the current version of your database, the preupgrade.jar from the new version (and probably download the latest one from MOS). This generates a script to run before the upgrade, and one to run after the upgrade. Those scripts are generated under $ORACLE_BASE/cfgtoollogs/<database>/preupgrade where you find something like this:

drwxr-xr-x. 3 oracle oinstall 4096 Aug 11 19:36 ..
drwxr-xr-x. 3 oracle oinstall 4096 Aug 11 19:36 oracle
drwxr-xr-x. 3 oracle oinstall 4096 Aug 11 19:36 upgrade
-rw-r--r--. 1 oracle oinstall 14846 Aug 11 20:19 dbms_registry_extended.sql
-rw-r--r--. 1 oracle oinstall 7963 Aug 11 20:19 preupgrade_driver.sql
-rw-r--r--. 1 oracle oinstall 422048 Aug 11 20:19 preupgrade_package.sql
-rw-r--r--. 1 oracle oinstall 14383 Aug 11 20:19 parameters.properties
-rw-r--r--. 1 oracle oinstall 83854 Aug 11 20:19 preupgrade_messages.properties
-rw-r--r--. 1 oracle oinstall 50172 Aug 11 20:19 components.properties
-rw-r--r--. 1 oracle oinstall 2 Aug 11 20:19 checksBuffer.tmp
-rw-r--r--. 1 oracle oinstall 6492 Aug 11 20:20 preupgrade_fixups.sql
-rw-r--r--. 1 oracle oinstall 7575 Aug 11 20:20 postupgrade_fixups.sql
-rw-r--r--. 1 oracle oinstall 5587 Aug 11 20:20 preupgrade.log

Everything is straightforward.

oracle@vmreforatun01:/u00/app/oracle/product/ [DB2] java -jar /u00/app/oracle/product/18EE/rdbms/admin/preupgrade.jar
...
==================
PREUPGRADE SUMMARY
==================
/oracle/u00/app/oracle/cfgtoollogs/DB2/preupgrade/preupgrade.log
/oracle/u00/app/oracle/cfgtoollogs/DB2/preupgrade/preupgrade_fixups.sql
/oracle/u00/app/oracle/cfgtoollogs/DB2/preupgrade/postupgrade_fixups.sql
 
Execute fixup scripts as indicated below:
 
Before upgrade log into the database and execute the preupgrade fixups
@/oracle/u00/app/oracle/cfgtoollogs/DB2/preupgrade/preupgrade_fixups.sql
 
After the upgrade:
 
Log into the database and execute the postupgrade fixups
@/oracle/u00/app/oracle/cfgtoollogs/DB2/preupgrade/postupgrade_fixups.sql
 
Preupgrade complete: 2018-08-11T19:37:29
oracle@vmreforatun01:/u00/app/oracle/product/ [DB2]

For a database we have in a lab for our workshops, which I upgraded to 18c, I ran the postupgrade fixups script after the upgrade but got the error mentioned above about a UTL_FILE invalid file operation in preupgrade_dir. I looked at the script. The postupgrade_fixups.sql script creates a directory object on $ORACLE_HOME/rdbms/admin and calls preupgrade_package.sql, which reads preupgrade_messages.properties.

This is a bit confusing because there’s also the same file in the cfgtoollogs preupgrade subdirectory but my directory looks good:

SQL> select directory_name,directory_path from dba_directories where directory_name='PREUPGRADE_DIR';
 
DIRECTORY_NAME
--------------------------------------------------------------------------------------------------------------------------------
DIRECTORY_PATH
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
PREUPGRADE_DIR
/u00/app/oracle/product/18SE/rdbms/admin

So, as the “ORA-29283: invalid file operation” is not very detailed, I traced all the system calls on files (strace -fye trace=file) when running sqlplus and got this:

[pid 29974] clock_gettime(CLOCK_MONOTONIC, {17811, 723389136}) = 0
[pid 29974] stat("/u00/app/oracle/product/18SE/rdbms/admin/preupgrade_messages.properties", {st_mode=S_IFREG|0644, st_size=83854, ...}) = 0
[pid 29974] stat("/u00/app/oracle/product/18SE/rdbms/admin/", {st_mode=S_IFDIR|0755, st_size=65536, ...}) = 0
[pid 29974] lstat("/u00", {st_mode=S_IFLNK|0777, st_size=11, ...}) = 0
[pid 29974] readlink("/u00", "/oracle/u00", 4095) = 11
[pid 29974] lstat("/oracle", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 29974] lstat("/oracle/u00", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 29974] lstat("/oracle/u00/app", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 29974] lstat("/oracle/u00/app/oracle", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 29974] lstat("/oracle/u00/app/oracle/product", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 29974] lstat("/oracle/u00/app/oracle/product/18SE", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 29974] lstat("/oracle/u00/app/oracle/product/18SE/rdbms", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 29974] lstat("/oracle/u00/app/oracle/product/18SE/rdbms/admin", {st_mode=S_IFDIR|0755, st_size=65536, ...}) = 0
[pid 29974] clock_gettime(CLOCK_MONOTONIC, {17811, 724514469}) = 0

Then I realized that the ORACLE_HOME is under a symbolic link. For whatever reason, on this environment, ORACLE_BASE is physically /oracle/u00/app/oracle but there’s a /u00 link to /oracle/u00 and this short one was used to set the environment variables. UTL_FILE, since 11g, and for security reasons, does not accept directories which use a symbolic link. And we can see on the strace above that it was detected (readlink).
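A quick way to spot this situation before running the fixup scripts is to compare the path as written with its physical path. The following is a minimal sketch, not part of any Oracle tooling: the helper names are hypothetical, and the comparison assumes an absolute path without ".", ".." or doubled slashes.

```shell
# Hypothetical helper: print the physical path of a directory,
# with all symbolic links resolved.
physical_path() {
  ( cd "$1" 2>/dev/null && pwd -P )
}

# Hypothetical helper: succeed when the path, as written, differs from its
# physical path, i.e. when at least one component is a symbolic link.
uses_symlink() {
  [ "$(physical_path "$1")" != "${1%/}" ]
}

# Example: warn when ORACLE_HOME goes through a symbolic link.
if [ -n "${ORACLE_HOME:-}" ] && uses_symlink "$ORACLE_HOME"; then
  echo "ORACLE_HOME=$ORACLE_HOME resolves to $(physical_path "$ORACLE_HOME")"
fi
```

With the environment described here, the warning would show /u00/app/oracle/product/18SE resolving to /oracle/u00/app/oracle/product/18SE.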

So, the solution can be a quick workaround here, changing the postupgrade_fixups.sql to set the physical path instead of the one read from ORACLE_HOME by dbms_system.get_env.

However, if you can restart the instance, then it will be better to set the ORACLE_HOME to the physical path. Symbolic links for the ORACLE_HOME may be misleading. Remember that the ORACLE_HOME text string is part of the instance identification, combined with ORACLE_SID. So, having different values, even when they resolve to the same path, will bring a lot of problems. Do not forget to change it everywhere (shell environment, listener.ora) so that you are sure that nobody will use a different one when starting the database.

 


Oracle Database on OpenShift


By Franck Pachot

In a previous post I described the setup of MiniShift on my laptop in order to run OpenShift for test purposes. I even pulled the Oracle Database image from the Docker Store. But the goal is to import it into OpenShift to deploy it from the Image Stream.

I start MiniShift on my laptop, specifying a larger disk (default is 20GB)

C:\Users\Franck>minishift start --disk-size 40g
-- Starting profile 'minishift'
-- Check if deprecated options are used ... OK
-- Checking if https://github.com is reachable ... OK
-- Checking if requested OpenShift version 'v3.9.0' is valid ... OK
-- Checking if requested OpenShift version 'v3.9.0' is supported ... OK
-- Checking if requested hypervisor 'virtualbox' is supported on this platform ... OK
-- Checking if VirtualBox is installed ... OK
-- Checking the ISO URL ... OK
-- Checking if provided oc flags are supported ... OK
-- Starting the OpenShift cluster using 'virtualbox' hypervisor ...
-- Minishift VM will be configured with ...
Memory: 2 GB
vCPUs : 2
Disk size: 40 GB
-- Starting Minishift VM .................................................................... OK
-- Checking for IP address ... OK
-- Checking for nameservers ... OK
-- Checking if external host is reachable from the Minishift VM ...
Pinging 8.8.8.8 ... OK
-- Checking HTTP connectivity from the VM ...
Retrieving http://minishift.io/index.html ... OK
-- Checking if persistent storage volume is mounted ... OK
-- Checking available disk space ... 1% used OK
Importing 'openshift/origin:v3.9.0' ............. OK
Importing 'openshift/origin-docker-registry:v3.9.0' ... OK
Importing 'openshift/origin-haproxy-router:v3.9.0' ...... OK
-- OpenShift cluster will be configured with ...
Version: v3.9.0
-- Copying oc binary from the OpenShift container image to VM ... OK
-- Starting OpenShift cluster ...........................................................
Using nsenter mounter for OpenShift volumes
Using public hostname IP 192.168.99.105 as the host IP
Using 192.168.99.105 as the server IP
Starting OpenShift using openshift/origin:v3.9.0 ...
OpenShift server started.
 
The server is accessible via web console at:
https://192.168.99.105:8443
 
You are logged in as:
User: developer
Password:
 
To login as administrator:
oc login -u system:admin

MiniShift starts a VirtualBox VM and gets an IP address from the VirtualBox DHCP, here 192.168.99.105
I can access the console at https://192.168.99.105:8443 and log in as developer or admin, but for the moment I’m continuing on the command line.

At any moment I can log in to the VM running OpenShift with the minishift command. Here I check the size of the disks:

C:\Users\Franck>minishift ssh
 
[docker@minishift ~]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/live-rw 9.8G 697M 9.0G 8% /
devtmpfs 974M 0 974M 0% /dev
tmpfs 1000M 0 1000M 0% /dev/shm
tmpfs 1000M 18M 983M 2% /run
tmpfs 1000M 0 1000M 0% /sys/fs/cgroup
/dev/sr0 344M 344M 0 100% /run/initramfs/live
/dev/sda1 39G 1.8G 37G 5% /mnt/sda1
tmpfs 200M 0 200M 0% /run/user/0
tmpfs 200M 0 200M 0% /run/user/1000

Build the Docker image

The goal is to run in OpenShift a container from an image that has been built somewhere else. In this example I’ll not build one but use one provided on the Docker store: the Oracle Database ‘slim’ image. For this example, I’ll use the minishift VM docker, just because it is there.

I have DockerTools installed on my laptop and just want to set the environment to connect to the docker server on the minishift VM. I can get the environment from minishift:

C:\Users\Franck>minishift docker-env
SET DOCKER_TLS_VERIFY=1
SET DOCKER_HOST=tcp://192.168.99.105:2376
SET DOCKER_CERT_PATH=C:\Users\Franck\.minishift\certs
REM Run this command to configure your shell:
REM @FOR /f "tokens=*" %i IN ('minishift docker-env') DO @call %i

Here is how to directly set the environment from it:

C:\Users\Franck>@FOR /f "tokens=*" %i IN ('minishift docker-env') DO @call %i

Now my docker commands will connect to this docker server. Here are the related info, minishift is already running several containers there for its own usage:

C:\Users\Franck>docker info
Containers: 9
Running: 7
Paused: 0
Stopped: 2
Images: 6
Server Version: 1.13.1
Storage Driver: overlay2
Backing Filesystem: xfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Log:
Swarm: inactive
Runtimes: docker-runc runc
Default Runtime: docker-runc
Init Binary: docker-init
containerd version: (expected: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1)
runc version: e9c345b3f906d5dc5e8100b05ce37073a811c74a (expected: 9df8b306d01f59d3a8029be411de015b7304dd8f)
init version: N/A (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
seccomp
Profile: default
selinux
Kernel Version: 3.10.0-862.6.3.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 1.953GiB
Name: minishift
ID: U7IQ:TE3X:HSGK:3ES2:IO6G:A7VI:3KUU:YMBC:3ZIR:QYUL:EQUL:VFMS
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: pachot
Registry: https://index.docker.io/v1/
Labels:
provider=virtualbox
Experimental: false
Insecure Registries:
172.30.0.0/16
127.0.0.0/8
Live Restore Enabled: false

As I’ll use the Oracle Database image for this example, I need to log in to the Docker Store to prove that I accept the licensing conditions:

C:\Users\Franck>docker login
Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head over to https://hub.docker.com to create one.
Username:
Password:
Login Succeeded

I pull the image. It takes some time because ‘slim’ still means 2GB with Oracle Database.

C:\Users\Franck>docker pull store/oracle/database-enterprise:12.2.0.1-slim
Trying to pull repository docker.io/store/oracle/database-enterprise ...
12.2.0.1-slim: Pulling from docker.io/store/oracle/database-enterprise
4ce27fe12c04: Pull complete
9d3556e8e792: Pull complete
fc60a1a28025: Pull complete
0c32e4ed872e: Pull complete
be0a1f1e8dfd: Pull complete
Digest: sha256:dbd87ae4cc3425dea7ba3d3f34e062cbd0afa89aed2c3f3d47ceb5213cc0359a
Status: Downloaded newer image for docker.io/store/oracle/database-enterprise:12.2.0.1-slim

Here is the image:

C:\Users\Franck>docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
openshift/origin-web-console v3.9.0 aa12a2fc57f7 7 weeks ago 495MB
openshift/origin-docker-registry v3.9.0 0530b896b578 7 weeks ago 465MB
openshift/origin-haproxy-router v3.9.0 6b85d7aec983 7 weeks ago 1.28GB
openshift/origin-deployer v3.9.0 39ee47797d2e 7 weeks ago 1.26GB
openshift/origin v3.9.0 12a3f005312b 7 weeks ago 1.26GB
openshift/origin-pod v3.9.0 6e08365fbba9 7 weeks ago 223MB
store/oracle/database-enterprise 12.2.0.1-slim 27c9559d36ec 12 months ago 2.08GB

My minishift VM disk has increased by 2GB:

C:\Users\Franck>minishift ssh -- df -Th /mnt/sda1
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 xfs 39G 3.9G 35G 11% /mnt/sda1

Push the image to OpenShift registry

OpenShift has its integrated container registry from which the Docker images are visible to Image Stream.
Here is the address of the registry:

C:\Users\Franck>minishift openshift registry
172.30.1.1:5000

I’ll run some OpenShift commands and the path to the minishift cache for ‘oc’ can be set with:

C:\Users\Franck>minishift oc-env
SET PATH=C:\Users\Franck\.minishift\cache\oc\v3.9.0\windows;%PATH%
REM Run this command to configure your shell:
REM @FOR /f "tokens=*" %i IN ('minishift oc-env') DO @call %i
 
C:\Users\Franck>@FOR /f "tokens=*" %i IN ('minishift oc-env') DO @call %i

I am still connected as developer to OpenShift:

C:\Users\Franck>oc whoami
developer

and I get the login token:

C:\Users\Franck>oc whoami -t
lde5zRPHjkDyaXU9ninZ6zX50cVu3liNBjQVinJdwFc

I use this token to login to the OpenShift registry with docker in order to be able to push the image:

C:\Users\Franck>docker login -u developer -p lde5zRPHjkDyaXU9ninZ6zX50cVu3liNBjQVinJdwFc 172.30.1.1:5000
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
Login Succeeded

I create a new project to import the image to:

C:\Users\Franck>oc new-project oracle --display-name=Oracle
Now using project "oracle" on server "https://192.168.99.105:8443".
 
You can add applications to this project with the 'new-app' command. For example, try:
 
oc new-app centos/ruby-22-centos7~https://github.com/openshift/ruby-ex.git
 
to build a new example application in Ruby.

This can also be done from the GUI. Here is the project on the right:
CaptureOpenShiftProject

I tag the image with the name of the registry (172.30.1.1:5000) and the name of the project (oracle) and add an image name, so that the full name is: 172.30.1.1:5000/oracle/ora122slim

C:\Users\Franck>docker tag store/oracle/database-enterprise:12.2.0.1-slim 172.30.1.1:5000/oracle/ora122slim

We can see this tagged image

C:\Users\Franck>docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
openshift/origin-web-console v3.9.0 aa12a2fc57f7 7 weeks ago 495MB
openshift/origin-docker-registry v3.9.0 0530b896b578 7 weeks ago 465MB
openshift/origin-haproxy-router v3.9.0 6b85d7aec983 7 weeks ago 1.28GB
openshift/origin-deployer v3.9.0 39ee47797d2e 7 weeks ago 1.26GB
openshift/origin v3.9.0 12a3f005312b 7 weeks ago 1.26GB
openshift/origin-pod v3.9.0 6e08365fbba9 7 weeks ago 223MB
172.30.1.1:5000/oracle/ora122slim latest 27c9559d36ec 12 months ago 2.08GB
store/oracle/database-enterprise 12.2.0.1-slim 27c9559d36ec 12 months ago 2.08GB

Note that it is the same IMAGE ID and doesn’t take more space:

C:\Users\Franck>minishift ssh -- df -Th /mnt/sda1
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 xfs 39G 3.9G 35G 11% /mnt/sda1

Then I’m finally ready to push the image to the OpenShift docker registry:

C:\Users\Franck>docker push 172.30.1.1:5000/oracle/ora122slim
The push refers to a repository [172.30.1.1:5000/oracle/ora122slim]
066e811424fb: Pushed
99d7f2451a1a: Pushed
a2c532d8cc36: Pushed
49c80855196a: Pushed
40c24f62a02f: Pushed
latest: digest: sha256:25b0ec7cc3987f86b1e754fc214e7f06761c57bc11910d4be87b0d42ee12d254 size: 1372
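The login, tag and push steps above can be grouped in one function. This is a sketch under assumptions: it is written for a POSIX shell rather than the Windows prompt used here, the function names are hypothetical, and it feeds the token through `--password-stdin`, as the docker warning suggests, instead of passing it with -p.

```shell
# Hypothetical helper: build the full image reference registry/project/name.
image_ref() {
  printf '%s/%s/%s\n' "$1" "$2" "$3"
}

# Hypothetical wrapper around the commands used above.
#   $1 = source image, $2 = registry, $3 = project, $4 = image name
push_to_openshift() {
  target=$(image_ref "$2" "$3" "$4")
  # Feed the token on stdin so that it does not appear on the command line.
  oc whoami -t | docker login -u "$(oc whoami)" --password-stdin "$2" &&
    docker tag "$1" "$target" &&
    docker push "$target"
}
```

For the image used in this post the call would be `push_to_openshift store/oracle/database-enterprise:12.2.0.1-slim 172.30.1.1:5000 oracle ora122slim`.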

This is a copy, and takes additional space (about 1.5GB according to df):

C:\Users\Franck>minishift ssh -- df -Th /mnt/sda1
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 xfs 39G 5.4G 33G 14% /mnt/sda1

Deploy the image

Finally, I can deploy the image as it is visible in the GUI:
CaptureOpenShiftImport

I choose to deploy from the command line:

C:\Users\Franck>oc new-app --image-stream=ora122slim --name=ora122slimdeployment
--> Found image 27c9559 (12 months old) in image stream "oracle/ora122slim" under tag "latest" for "ora122slim"
 
* This image will be deployed in deployment config "ora122slimdeployment"
* Ports 1521/tcp, 5500/tcp will be load balanced by service "ora122slimdeployment"
* Other containers can access this service through the hostname "ora122slimdeployment"
* This image declares volumes and will default to use non-persistent, host-local storage.
You can add persistent volumes later by running 'volume dc/ora122slimdeployment --add ...'

--> Creating resources ...
imagestreamtag "ora122slimdeployment:latest" created
deploymentconfig "ora122slimdeployment" created
service "ora122slimdeployment" created
--> Success
Application is not exposed. You can expose services to the outside world by executing one or more of the commands below:
'oc expose svc/ora122slimdeployment'
Run 'oc status' to view your app.

CaptureOpenShiftDeploy

I expose the service:

C:\Users\Franck>oc expose service ora122slimdeployment
route "ora122slimdeployment" exposed

/bin/bash: /home/oracle/setup/dockerInit.sh: Permission denied

Here is one little thing to change. From the POD terminal, I can see the following error:
CaptureOpenShiftCrash

The same can be read from command line:

C:\Users\Franck>oc status
In project Oracle (oracle) on server https://192.168.99.105:8443
 
http://ora122slimdeployment-oracle.192.168.99.105.nip.io to pod port 1521-tcp (svc/ora122slimdeployment)
dc/ora122slimdeployment deploys istag/ora122slim:latest
deployment #1 deployed 7 minutes ago - 0/1 pods (warning: 6 restarts)
 
Errors:
* pod/ora122slimdeployment-1-86prl is crash-looping
 
1 error, 2 infos identified, use 'oc status -v' to see details.
 
C:\Users\Franck>oc logs ora122slimdeployment-1-86prl -c ora122slimdeployment
/bin/bash: /home/oracle/setup/dockerInit.sh: Permission denied

This is because by default, for security reasons, OpenShift runs the container with a random user id. But the file is executable only by the oracle user and the oinstall group:

sh-4.2$ ls -l /home/oracle/setup/dockerInit.sh
-rwxr-xr--. 1 oracle oinstall 2165 Aug 17 2017 /home/oracle/setup/dockerInit.sh
sh-4.2$
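The mode string explains the error: the last three characters before the dot apply to "others", and there is no x among them. A small sketch makes this explicit. It assumes, as the OpenShift image guidelines describe, that the arbitrarily assigned user is neither oracle nor a member of oinstall, so only the "others" bits apply; the helper name is made up.

```shell
# Hypothetical helper: given the mode string printed by `ls -l`
# (for example "-rwxr-xr--."), succeed if "others" may execute the file.
others_can_exec() {
  # character 10 is the execute bit for others ("x", or "t" with sticky bit)
  case "$(printf '%s' "$1" | cut -c10)" in
    x|t) return 0 ;;
    *)   return 1 ;;
  esac
}

if others_can_exec '-rwxr-xr--.'; then
  echo "others can execute"
else
  echo "others cannot execute"   # this branch is taken for dockerInit.sh
fi
```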

The solution is quite simple: allow the container to run with its own user id:

C:\Users\Franck>minishift addon apply anyuid
-- Applying addon 'anyuid':.
Add-on 'anyuid' changed the default security context constraints to allow pods to run as any user.
Per default OpenShift runs containers using an arbitrarily assigned user ID.
Refer to https://docs.openshift.org/latest/architecture/additional_concepts/authorization.html#security-context-constraints and
https://docs.openshift.org/latest/creating_images/guidelines.html#openshift-origin-specific-guidelines for more information.

Then the restart of the POD goes further:
CaptureOpenShiftOracle

This Oracle Database from the Docker Store is not really an image of an installed Oracle Database, but just a tar of Oracle Home and Database files that have to be untarred.

Now, in addition to the image size I have an additional 2GB layer for the container:

C:\Users\Franck>minishift ssh -- df -Th /mnt/sda1
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 xfs 39G 11G 28G 28% /mnt/sda1
 
C:\Users\Franck>docker system df
TYPE TOTAL ACTIVE SIZE RECLAIMABLE
Images 7 6 3.568GB 1.261GB (35%)
Containers 17 9 1.895GB 58.87kB (0%)
Local Volumes 0 0 0B 0B
Build Cache 0B 0B

Of course there is more to customize. The minishift VM should have more memory and the container for Oracle Database as well. We probably want to add an external volume, and export ports outside of the minishift VM.

 



The size of Oracle Home: from 9GB to 600MB


This is research only and totally unsupported. When building docker images to run Oracle Database in a container, we try to get the smallest image possible. One way is to remove some subdirectories that we know will not be used. For example, the patch history is not used anymore once we have the required version. The dbca templates can be removed as soon as we have created the database… In this post I take the opposite approach: run some workload on a normal Oracle Home, and keep only the files that were used.

I have Oracle Database 18c installed in /u00/app/oracle/product/18EE and it takes 9GB on my host:

[oracle@vmreforatun01 ~]$ du --human-readable --max-depth=1 $ORACLE_HOME | sort -h | tail -10
 
352M /u00/app/oracle/product/18EE/jdk
383M /u00/app/oracle/product/18EE/javavm
423M /u00/app/oracle/product/18EE/inventory
437M /u00/app/oracle/product/18EE/assistants
605M /u00/app/oracle/product/18EE/md
630M /u00/app/oracle/product/18EE/bin
673M /u00/app/oracle/product/18EE/apex
1.4G /u00/app/oracle/product/18EE/.patch_storage
2.3G /u00/app/oracle/product/18EE/lib
9.4G /u00/app/oracle/product/18EE

Gigabytes of libraries (most of them used only to link the executables), hundreds of megabytes of binaries, templates for new databases, applied patches, old object files, options, tools, command line and graphical interfaces,… Do we need all that?

For a full installation in production, yes for sure. The more we have, the better it is. When you have to connect at 2 a.m. because you are on-call and a critical alert wakes you up, you will appreciate having all tools on the server. Especially if you connect through a few security obstacles such as remote VPN, desktop, Wallix, tunnels to finally get a high latency tty with no copy-paste possibilities. With a full Oracle Home, you can face any issue. You have efficient command line interfaces (sqlplus and hopefully sqlcl) or graphical ones (SQLDeveloper, asmca,…). For severe problems, you can even re-link, apply or rollback patches, quickly create a new database to import something in it,…

But what if you just want to provide a small container where a database is running, with no additional administration support? One where you will never re-install the software, apply patches, re-create the database, or troubleshoot weird issues, and where users just connect through the listener port and never log in to the container? Then, most of these 9.4 GB are useless.

But how to know which files are useful or not?

If you can rely on Linux ‘access time’ then you may look at the files accessed during the last days – after any installation or database creation is done:

[oracle@vmreforatun01 ~]$ find $ORACLE_HOME -atime -1 -exec stat -L -c "%x %y %z %F %n" {} \; | sort

But this is not reliable. Access time depends on the file type, filesystem, mount options,… and is usually bypassed as much as possible because writing something just to log that you read something is not a very good idea.
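Whether access times can be trusted at all depends first on the mount options. Here is a minimal sketch that classifies them, assuming a Linux host where /proc/mounts is readable; the function name `atime_mode` is hypothetical.

```shell
# Hypothetical helper: classify the atime behaviour from a comma-separated
# list of mount options, such as the fourth field of /proc/mounts.
atime_mode() {
  case ",$1," in
    *,noatime,*)     echo noatime ;;
    *,relatime,*)    echo relatime ;;
    *,strictatime,*) echo strictatime ;;
    *)               echo unspecified ;;
  esac
}

# Show the mode for every mounted filesystem (Linux only).
if [ -r /proc/mounts ]; then
  while read -r dev mnt fstype opts rest; do
    echo "$mnt $(atime_mode "$opts")"
  done < /proc/mounts
fi
```

With noatime the access time is never updated by reads, and with relatime it is updated only in some cases, so in both situations the find above can miss files that were actually read.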

Here, I’ll trace all system calls related to file names (strace -e trace=file). I’ll trace them from the start of the database, so I run strace on dbstart with the -f argument to trace across forks. This way I trace the listener, the instance processes, and any user process created through the listener.

I pipe the output to an awk script which extracts the file names (which are enclosed in double quotes in the strace output). Basically, the awk just sets the field separator to a double quote with -F'"' and prints the $2 token of each line. There are many single and double quotes here because of shell interpretation.

[oracle@vmreforatun01 ~]$ dbshut $ORACLE_HOME ; strace -fe trace=file -o "|awk -F'"'"'"' '"'{print $2}'"'" sh -xc "dbstart $ORACLE_HOME >&2" | grep "^$ORACLE_HOME" | sort -u > /tmp/files.txt &
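
The extraction step can be tried in isolation, without Oracle or strace. The following is only a sketch: the trace lines are hypothetical samples written in the usual `strace -f -e trace=file` layout, and sample_trace and extract_files are illustrative helper names, not part of the original command.

```shell
# Hypothetical sample lines in the layout written by "strace -f -e trace=file -o ..."
# (these are NOT taken from the original run).
sample_trace() {
  cat <<'EOF'
1234 open("/u00/app/oracle/product/18EE/lib/libclntsh.so.18.1", O_RDONLY|O_CLOEXEC) = 3
1234 stat("/u00/app/oracle/product/18EE/bin/oracle", {st_mode=S_IFREG|06751, ...}) = 0
1235 open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 4
1235 access("/u00/app/oracle/product/18EE/bin/oracle", X_OK) = 0
EOF
}

# Illustrative helper: keep the first double-quoted token of each line,
# restrict it to paths under the prefix given in $1, and deduplicate.
extract_files() {
  awk -F'"' '{print $2}' | grep "^$1" | sort -u
}

sample_trace | extract_files /u00/app/oracle/product/18EE
```

With the double quote as field separator, $1 is the text before the first quote (pid and system call name) and $2 is the path itself, so the file outside the Oracle Home is dropped by the grep and the path that was referenced twice is listed once.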

Then I run some activity. I did this on our Oracle Tuning training workshop lab, when reviewing all exercises after upgrading the lab VM to 18c. This runs some usual SQL for application (we use Swingbench) and monitoring. The idea is to run through all features that you want to be available on the container you will build.

When I’m done, I run dbshut (remember this is for a lab only: strace is not for production) and the strace output then gets deduplicated (sort -u) and written to /tmp/files.txt.

This file contains all files referenced by system calls. Surprisingly, there is one that is not captured here, the ldap messages file, but if I do not include it then remote connections fail with:

ORA-07445: exception encountered: core dump [gslumcCalloc()+41] [SIGSEGV] [ADDR:0x21520] [PC:0x60F92D9] [Address not mapped to object] []

I got it with a very empirical approach, will try to understand later. For the moment, I just add it to the list:

[oracle@vmreforatun01 ~]$ ls $ORACLE_HOME/ldap/mesg/ldapus.msb >> /tmp/files.txt

I also add adrci and dbshut scripts as they are small and may be useful:

[oracle@vmreforatun01 ~]$ ls $ORACLE_HOME/bin/adrci $ORACLE_HOME/bin/dbshut >> /tmp/files.txt

From this list, I check those which are not directories, and tar all regular files and symbolic links into /tmp/smalloh.tar:

[oracle@vmreforatun01 ~]$ stat -c "%F %n" $(cat /tmp/files.txt) | awk '!/^directory/{print $3}' | tar -cvf /tmp/smalloh.tar --dereference --files-from=-
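
Directories have to be filtered out because tar recurses into any directory it is given, which would bring back files that were never accessed. Here is a small sketch of the same filtering on a throw-away sandbox, assuming a hypothetical directory layout and list file that stand in for $ORACLE_HOME and /tmp/files.txt; keep_non_directories is an illustrative helper, and reading the list line by line also tolerates file names containing spaces.

```shell
# Hypothetical sandbox standing in for the Oracle Home and /tmp/files.txt.
work=$(mktemp -d)
mkdir -p "$work/home/bin" "$work/home/lib"
echo binary  > "$work/home/bin/oracle"
echo library > "$work/home/lib/libclntsh.so.18.1"
ln -s libclntsh.so.18.1 "$work/home/lib/libclntsh.so"
# The list contains a directory, a regular file, a symbolic link and a path that no longer exists.
printf '%s\n' home/bin home/bin/oracle home/lib/libclntsh.so home/lib/gone.so > "$work/files.txt"

# Illustrative helper: pass through only the paths that exist and are not directories.
keep_non_directories() {
  while IFS= read -r f; do
    if [ -e "$f" ] && [ ! -d "$f" ]; then
      printf '%s\n' "$f"
    fi
  done
}

# -h follows symbolic links (same as --dereference), -T reads the member names from a file.
( cd "$work" && keep_non_directories < files.txt > keep.txt && tar -chf small.tar -T keep.txt )
tar -tf "$work/small.tar"
```

The archive lists only home/bin/oracle and home/lib/libclntsh.so; because of the dereference option the symbolic link is stored as a copy of the file it points to.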

This is a 600M tar:

[oracle@vmreforatun01 ~]$ du -h /tmp/smalloh.tar
 
598M /tmp/smalloh.tar

Then I can remove my Oracle Home

[oracle@vmreforatun01 ~]$ cd $ORACLE_HOME/..
[oracle@vmreforatun01 product]$ rm -rf 18EE
[oracle@vmreforatun01 product]$ mkdir 18EE

and extract the files from my tar:

[oracle@vmreforatun01 /]$ tar -xf /tmp/smalloh.tar

I forgot that there are some setuid executables so I must be root to set them:

[oracle@vmreforatun01 /]$ ls -l $ORACLE_HOME/bin/oracle
-rwxr-x--x. 1 oracle oinstall 437157251 Aug 11 18:40 /u00/app/oracle/product/18EE/bin/oracle
[oracle@vmreforatun01 /]$ su
Password:
[root@vmreforatun01 /]# tar -xf /tmp/smalloh.tar
[root@vmreforatun01 /]# exit
[oracle@vmreforatun01 /]$ ls -l $ORACLE_HOME/bin/oracle
-rwsr-s--x. 1 oracle oinstall 437157251 Aug 11 18:40 /u00/app/oracle/product/18EE/bin/oracle

That’s a 600MB Oracle Home then. You can reduce it further by stripping the binaries:

[oracle@vmreforatun01 18EE]$ du -hs $ORACLE_HOME
599M /u00/app/oracle/product/18EE
[oracle@vmreforatun01 18EE]$ strip $ORACLE_HOME/bin/* $ORACLE_HOME/lib/*
[oracle@vmreforatun01 18EE]$ du -hs $ORACLE_HOME
570M /u00/app/oracle/product/18EE

but for only 30MB I really prefer to keep all symbols. As I’m doing something completely unsupported, I may have to do some troubleshooting.

Now I’m ready to start the database and the listener:

[oracle@vmreforatun01 18EE]$ dbstart $ORACLE_HOME
Processing Database instance "DB1": log file /u00/app/oracle/product/18EE/rdbms/log/startup.log

and I run some Swingbench workload to check that everything is fine:

[oracle@vmreforatun01 18EE]$ /home/oracle/swingbench/bin/charbench -cs //localhost:1521/APP -u soe -p soe -uc 10 -min 5 -max 20 -a -v
Author : Dominic Giles
Version : 2.5.0.932
 
Results will be written to results.xml.
 
Time Users TPM TPS
 
6:35:15 PM 0 0 0
...
6:35:44 PM 10 12 9
6:35:45 PM 10 16 4
6:35:46 PM 10 21 5
6:35:47 PM 10 31 10

The only errors in alert.log are about checking the patches at install:

QPI: OPATCH_INST_DIR not present:/u00/app/oracle/product/18EE/OPatch
Unable to obtain current patch information due to error: 20013, ORA-20013: DBMS_QOPATCH ran mostly in non install area
ORA-06512: at "SYS.DBMS_QOPATCH", line 767
ORA-06512: at "SYS.DBMS_QOPATCH", line 547
ORA-06512: at "SYS.DBMS_QOPATCH", line 2124

Most of those 600MB are in the server executable (bin/oracle) and client shared library (lib/libclntsh.so):

[oracle@vmreforatun01 ~]$ size -td /u00/app/oracle/product/18EE/bin/* /u00/app/oracle/product/18EE/lib/* | sort -n
 
text data bss dec hex filename
2423 780 48 3251 cb3 /u00/app/oracle/product/18EE/lib/libofs.so
4684 644 48 5376 1500 /u00/app/oracle/product/18EE/lib/libskgxn2.so
5301 732 48 6081 17c1 /u00/app/oracle/product/18EE/lib/libodm18.so
10806 2304 1144 14254 37ae /u00/app/oracle/product/18EE/bin/sqlplus
13993 2800 1136 17929 4609 /u00/app/oracle/product/18EE/bin/adrci
46456 3008 160 49624 c1d8 /u00/app/oracle/product/18EE/lib/libnque18.so
74314 4824 1248 80386 13a02 /u00/app/oracle/product/18EE/bin/oradism
86396 23968 1144 111508 1b394 /u00/app/oracle/product/18EE/bin/lsnrctl
115523 2196 48 117767 1cc07 /u00/app/oracle/product/18EE/lib/libocrutl18.so
144591 3032 160 147783 24147 /u00/app/oracle/product/18EE/lib/libdbcfg18.so
216972 2564 48 219584 359c0 /u00/app/oracle/product/18EE/lib/libclsra18.so
270692 13008 160 283860 454d4 /u00/app/oracle/product/18EE/lib/libskjcx18.so
321701 5024 352 327077 4fda5 /u00/app/oracle/product/18EE/lib/libons.so
373988 7096 9536 390620 5f5dc /u00/app/oracle/product/18EE/lib/libmql1.so
717398 23224 110088 850710 cfb16 /u00/app/oracle/product/18EE/bin/orabaseconfig
717398 23224 110088 850710 cfb16 /u00/app/oracle/product/18EE/bin/orabasehome
878351 36800 1144 916295 dfb47 /u00/app/oracle/product/18EE/bin/tnslsnr
928382 108920 512 1037814 fd5f6 /u00/app/oracle/product/18EE/lib/libcell18.so
940122 56176 2376 998674 f3d12 /u00/app/oracle/product/18EE/lib/libsqlplus.so
1118019 16156 48 1134223 114e8f /u00/app/oracle/product/18EE/lib/libocr18.so
1128954 5936 160 1135050 1151ca /u00/app/oracle/product/18EE/lib/libskgxp18.so
1376814 18548 48 1395410 154ad2 /u00/app/oracle/product/18EE/lib/libocrb18.so
1685576 130464 160 1816200 1bb688 /u00/app/oracle/product/18EE/lib/libasmclntsh18.so
2517125 16496 15584 2549205 26e5d5 /u00/app/oracle/product/18EE/lib/libipc1.so
3916867 86504 111912 4115283 3ecb53 /u00/app/oracle/product/18EE/lib/libclntshcore.so.18.1
4160241 26320 69264 4255825 40f051 /u00/app/oracle/product/18EE/lib/libmkl_rt.so
5120001 459984 7784 5587769 554339 /u00/app/oracle/product/18EE/lib/libnnz18.so
10822468 302312 21752 11146532 aa1524 /u00/app/oracle/product/18EE/lib/libhasgen18.so
11747579 135320 160 11883059 b55233 /u00/app/oracle/product/18EE/lib/libshpkavx218.so
61758209 2520896 134808 64413913 3d6e0d9 /u00/app/oracle/product/18EE/lib/libclntsh.so.18.1
376147897 3067672 602776 379818345 16a39169 /u00/app/oracle/product/18EE/bin/oracle
487369241 7106932 1203944 495680117 1d8b7a75 (TOTALS)

Of course, this is probably not sufficient, especially if you want to run APEX, OJVM, Oracle Text. The method is there: run a workload that covers everything you need, and build the Oracle Home from the files used there. I used strace here, but auditd can also be a good idea. Ideally, this job will be done one day by Oracle itself in a supported way, so that we can build a core container for Oracle Database and add features as Dockerfile layers. This had been done to release Oracle XE 11g, which is only 300MB. However, Oracle XE 18c, announced for October, will probably be larger as it includes nearly all options.

 

The article The size of Oracle Home: from 9GB to 600MB first appeared on Blog dbi services.

A tribute to Natural Join


By Franck Pachot

.
I know that a lot of people are against the ANSI join syntax in Oracle. And this goes beyond the limits when talking about NATURAL JOIN. But I like them and use them quite often.

Why is Natural Join bad?

Natural join is bad because it relies on column names, and, at the time of writing the query, you don’t know which columns will be added or removed later. Here is an example on the SCOTT schema, joining on DEPTNO which has the same name in DEPT and EMP:

SQL> select * from EMP natural join DEPT where DNAME='SALES';
 
DEPTNO EMPNO ENAME JOB MGR HIREDATE SAL COMM DNAME LOC
---------- ---------- ---------- --------- ---------- --------- ---------- ---------- -------------- -------------
30 7521 WARD SALESMAN 7698 22-FEB-81 1250 500 SALES CHICAGO
30 7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 SALES CHICAGO
30 7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 SALES CHICAGO
30 7900 JAMES CLERK 7698 03-DEC-81 950 SALES CHICAGO
30 7698 BLAKE MANAGER 7839 01-MAY-81 2850 SALES CHICAGO
30 7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 SALES CHICAGO

The DEPT table has a ‘LOC’ column for the location of the department. But the data model may evolve and you may add a location for each employee. And we may also call it LOC:

SQL> alter table EMP add (LOC varchar2(10));
Table altered.

But now our Natural Join adds this column to the join predicate and the result is wrong because it shows only rows which have the same department location as employee location:

SQL> select * from EMP natural join DEPT where DNAME='SALES';
 
no rows selected

Projection

In my opinion, the problem is not the Natural Join. Column names have a meaning for their tables. But the tables have different roles in our queries. As soon as a table or view participates in our query, we should redefine the column names. If we don’t, the result is completely wrong, as in:

SQL> select * from EMP join DEPT using(DEPTNO) where DNAME='SALES';
 
DEPTNO EMPNO ENAME JOB MGR HIREDATE SAL COMM LOC DNAME LOC
---------- ---------- ---------- --------- ---------- --------- ---------- ---------- ---------- -------------- -------------
30 7521 WARD SALESMAN 7698 22-FEB-81 1250 500 SALES CHICAGO
30 7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 SALES CHICAGO
30 7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 SALES CHICAGO
30 7900 JAMES CLERK 7698 03-DEC-81 950 SALES CHICAGO
30 7698 BLAKE MANAGER 7839 01-MAY-81 2850 SALES CHICAGO
30 7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 SALES CHICAGO

Look: the result has two columns with the same name. This is completely wrong for a relational database and I don’t even understand why this parses without raising an error.

The projection is the most important relational operation, often overlooked as if it were just a rename for aesthetic purposes. You need to name the columns of your result set. They are the metadata for the interface between SQL and the host language. ‘select *’ is a shortcut when running an interactive query, to get a glance at the result rows. But a SQL query result is not complete without proper column names. And in most cases, at least when you query more than one table, the names of the query result columns should be different from the names of the underlying table columns. A department may have a location. And an employee may have a location. But the location of the employee department is something completely different from the employee location.

Then, as you need to name each column anyway, why not do it as soon as possible? Do it for each table involved in the query, so that you are sure that all column names are correct within the query. As soon as you introduce a new table in the FROM clause, you should actually name the columns according to their role in the query. Let’s take an example with an airline data model. Each airport is linked to a city. This can be a CITY column in the AIRPORTS table. But as soon as you join FLIGHTS with AIRPORTS, this table has a different role. You join on destination airport or source airport. Then you alias the AIRPORTS table in the FROM clause, such as DST_AIRPORTS or SRC_AIRPORTS. Within the query, you can reference the columns with the table alias, such as DST_AIRPORTS.CITY or SRC_AIRPORTS.CITY but this cannot be exposed as-is in the query result. You must name them in the SELECT clause with something like SELECT DST_AIRPORTS.CITY as DST_ARP_CITY , SRC_AIRPORTS.CITY as SRC_ARP_CITY.

Then, as I’ll need to rename them anyway, I prefer to do it as soon as I join to a new table in the FROM clause. Instead of joining to AIRPORTS DST_AIRPORTS I can join to (SELECT IATA DST_ARP_IATA, CITY DST_ARP_CITY FROM AIRPORTS) and all column names will relate to the role without table aliases and without further renaming. And when I do that correctly, I can use natural join without risk.

Projection in the FROM clause

Let’s take an example. Here is a query on DEPT where I explicitly mention that LOC is the department location. This is implicit when the column name belongs to the DEPT table. But it will not be implicit anymore once I join this table to another table. Here is the view ready to be included in any query:


SQL> select DEPTNO,DNAME DEPT_DNAME,LOC DEPT_LOC from DEPT where DNAME='SALES';
 
DEPTNO DEPT_DNAME DEPT_LOC
---------- -------------- -------------
30 SALES CHICAGO

Now, I can join this to the EMP table. I prefix all columns from EMP with “EMP_” and all columns from DEPT with “EMP_DEPT_” because they belong to DEPT when in the role of employee department:

SQL> select EMP_EMPNO,EMP_ENAME,EMP_DEPT_DNAME,EMP_DEPT_LOC,EMP_LOC,EMP_MGR_EMPNO
from
(select DEPTNO EMP_DEPTNO,EMPNO EMP_EMPNO,ENAME EMP_ENAME,MGR EMP_MGR_EMPNO,LOC EMP_LOC from EMP)
natural join
(select DEPTNO EMP_DEPTNO,DNAME EMP_DEPT_DNAME,LOC EMP_DEPT_LOC from DEPT)
where EMP_DEPT_DNAME='SALES';
 
EMP_EMPNO EMP_ENAME EMP_DEPT_DNAME EMP_DEPT_LOC EMP_LOC EMP_MGR_EMPNO
---------- ---------- -------------- ------------- ---------- -------------
7521 WARD SALES CHICAGO 7698
7844 TURNER SALES CHICAGO 7698
7499 ALLEN SALES CHICAGO 7698
7900 JAMES SALES CHICAGO 7698
7698 BLAKE SALES CHICAGO 7839
7654 MARTIN SALES CHICAGO 7698

As you can see, when the names are clearly indicating the column with its role in the join, and how they are correlated with the other tables, there is no need to mention any join predicate. I used Natural Join because the join is on EMP_DEPTNO and I’m sure that it will always be the one and only one column with the same name. By query design.

And the column names in the result are correct, explicitly mentioning what is an Employee attribute or an Employee department attribute. That can be easy to parse and put in an object graph in the host language. You can see there that the MGR column of EMP was named EMP_MGR_EMPNO because this is actually what it is: the EMPNO of the employee manager. It is a foreign key to the EMP table.

And then, adding more information about the manager is easy: join with EMP again but with the proper projection of columns: EMPNO will be EMP_MGR_EMPNO when in the role of the employee manager, ENAME will be EMP_MGR_ENAME, DEPTNO will be EMP_MGR_DEPTNO, and so on:


SQL> select EMP_EMPNO,EMP_ENAME,EMP_DEPT_DNAME,EMP_DEPT_LOC,EMP_LOC,EMP_MGR_DEPTNO,EMP_MGR_ENAME
from
(select DEPTNO EMP_DEPTNO,EMPNO EMP_EMPNO,ENAME EMP_ENAME,MGR EMP_MGR_EMPNO,LOC EMP_LOC from EMP)
natural join
(select DEPTNO EMP_DEPTNO,DNAME EMP_DEPT_DNAME,LOC EMP_DEPT_LOC from DEPT)
natural join
(select DEPTNO EMP_MGR_DEPTNO,EMPNO EMP_MGR_EMPNO,ENAME EMP_MGR_ENAME from EMP)
where EMP_DEPT_DNAME='SALES';
 
EMP_EMPNO EMP_ENAME EMP_DEPT_DNAME EMP_DEPT_LOC EMP_LOC EMP_MGR_DEPTNO EMP_MGR_ENAME
---------- ---------- -------------- ------------- ---------- -------------- -------------
7900 JAMES SALES CHICAGO 30 BLAKE
7499 ALLEN SALES CHICAGO 30 BLAKE
7654 MARTIN SALES CHICAGO 30 BLAKE
7844 TURNER SALES CHICAGO 30 BLAKE
7521 WARD SALES CHICAGO 30 BLAKE
7698 BLAKE SALES CHICAGO 10 KING

No need to review the whole query when adding a new table. No need to solve the new ‘column ambiguously defined’. We don’t even need to alias the tables here.

Want to add the department name of the manager? That’s easy: join to DEPT with the right column projection (all prefixed by EMP_MGR_DEPT as the new columns are all about the employee manager’s department):

SQL> select EMP_EMPNO,EMP_ENAME,EMP_DEPT_DNAME,EMP_DEPT_LOC,EMP_LOC,EMP_MGR_DEPTNO,EMP_MGR_ENAME,EMP_MGR_DEPT_DNAME
from
(select DEPTNO EMP_DEPTNO,EMPNO EMP_EMPNO,ENAME EMP_ENAME,MGR EMP_MGR_EMPNO,LOC EMP_LOC from EMP)
natural join
(select DEPTNO EMP_DEPTNO,DNAME EMP_DEPT_DNAME,LOC EMP_DEPT_LOC from DEPT)
natural join
(select DEPTNO EMP_MGR_DEPTNO,EMPNO EMP_MGR_EMPNO,ENAME EMP_MGR_ENAME from EMP)
natural join
(select DEPTNO EMP_MGR_DEPTNO,DNAME EMP_MGR_DEPT_DNAME,LOC EMP_MGR_DEPT_LOC from DEPT)
where EMP_DEPT_DNAME='SALES';
 
EMP_EMPNO EMP_ENAME EMP_DEPT_DNAME EMP_DEPT_LOC EMP_LOC EMP_MGR_DEPTNO EMP_MGR_EN EMP_MGR_DEPT_D
---------- ---------- -------------- ------------- ---------- -------------- ---------- --------------
7698 BLAKE SALES CHICAGO 10 KING ACCOUNTING
7900 JAMES SALES CHICAGO 30 BLAKE SALES
7499 ALLEN SALES CHICAGO 30 BLAKE SALES
7654 MARTIN SALES CHICAGO 30 BLAKE SALES
7844 TURNER SALES CHICAGO 30 BLAKE SALES
7521 WARD SALES CHICAGO 30 BLAKE SALES

This can be even easier when you generate SQL queries. When adding a new table to join to, you just prefix all columns with their role. Check foreign keys so that the naming is consistent with the referenced tables. Then when parsing the result, the naming convention can help to break on the object hierarchy.
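
As an illustration of such generation, here is a minimal shell sketch. The project function and its COLUMN=NAME argument format are hypothetical conventions chosen for this example; the mapping from each column to its role-based name is given explicitly, because names like EMP_DEPTNO or EMP_MGR_EMPNO follow the foreign keys rather than a plain prefix. The qb_name hint and the subquery alias are added at the same time.

```shell
# Hypothetical generator: project TABLE ROLE COLUMN=NAME...
# Prints one FROM-clause subquery that renames each column according to its role.
project() {
  table=$1
  role=$2
  shift 2
  cols=""
  for pair in "$@"; do
    # ${pair%%=*} is the source column, ${pair#*=} is the name it takes in this role
    cols="${cols:+$cols,}${pair%%=*} ${pair#*=}"
  done
  printf '(select /*+qb_name(%s)*/ %s from %s) %s\n' "$role" "$cols" "$table" "$role"
}

# DEPT in the role of the employee's department:
project DEPT EMP_DEPT DEPTNO=EMP_DEPTNO DNAME=EMP_DEPT_DNAME LOC=EMP_DEPT_LOC
```

The call above prints (select /*+qb_name(EMP_DEPT)*/ DEPTNO EMP_DEPTNO,DNAME EMP_DEPT_DNAME,LOC EMP_DEPT_LOC from DEPT) EMP_DEPT, and several such subqueries can then be chained with natural join.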

Additional notes

I mentioned that aliasing the subquery is not mandatory because I do not have to prefix the column names. However, when looking at the predicates section of the execution plan, the columns may be prefixed with an internal alias:

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("from$_subquery$_006"."EMP_MGR_DEPTNO"="from$_subquery$_009"."EMP_MGR_DEPTNO")
2 - access("from$_subquery$_001"."EMP_MGR_EMPNO"="from$_subquery$_006"."EMP_MGR_EMPNO" AND "from$_subquery$_001"."EMP_DEPTNO"="from$_subquery$_003"."EMP_DEPTNO")

Then it is a good idea to add aliases, such as EMP, EMP_DEPT, EMP_MGR and EMP_MGR_DEPT, to the subqueries in the query above so that the predicates become:

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("EMP_MGR"."EMP_MGR_DEPTNO"="EMP_MGR_DEPT"."EMP_MGR_DEPTNO")
2 - access("EMP"."EMP_MGR_EMPNO"="EMP_MGR"."EMP_MGR_EMPNO" AND "EMP"."EMP_DEPTNO"="EMP_DEPT"."EMP_DEPTNO")
5 - filter("DNAME"='SALES')

I also like to add a QB_NAME hint so that I can reference easily those subqueries if I have to add some hints there. Finally, this is what I can generate for this query:


SQL> select EMP_EMPNO,EMP_ENAME,EMP_DEPT_DNAME,EMP_DEPT_LOC,EMP_LOC,EMP_MGR_DEPTNO,EMP_MGR_ENAME,EMP_MGR_DEPT_DNAME
from
(select /*+qb_name(EMP)*/ DEPTNO EMP_DEPTNO,EMPNO EMP_EMPNO,ENAME EMP_ENAME,MGR EMP_MGR_EMPNO,LOC EMP_LOC from EMP) EMP
natural join
(select /*+qb_name(EMP_DEPT)*/ DEPTNO EMP_DEPTNO,DNAME EMP_DEPT_DNAME,LOC EMP_DEPT_LOC from DEPT) EMP_DEPT
natural join
(select /*+qb_name(EMP_MGR)*/ DEPTNO EMP_MGR_DEPTNO,EMPNO EMP_MGR_EMPNO,ENAME EMP_MGR_ENAME from EMP) EMP_MGR
natural join
(select /*+qb_name(EMP_MGR_DEPT)*/ DEPTNO EMP_MGR_DEPTNO,DNAME EMP_MGR_DEPT_DNAME,LOC EMP_MGR_DEPT_LOC from DEPT) EMP_MGR_DEPT
where EMP_DEPT_DNAME='SALES';

So what?

My goal here is not to recommend always using natural joins. This depends on the context (ad-hoc queries, embedded ones in existing code with naming standards,…) and on whether you can control the column names exactly. There are also a few bugs with ANSI joins, and natural join is not widely used, so maybe not tested a lot. But when I hear that Natural Join is bad, I want to explain the why/how/when. And one of the good sides of it is that it forces us to do the projection/rename as soon as possible and this makes the query easier to read/maintain/evolve. Of course, using natural join in that way requires that all tables are added to the FROM clause through a subquery which carefully names all columns in the SELECT clause so that the correlation with the other tables is clearly defined.

 

The article A tribute to Natural Join first appeared on Blog dbi services.

CPU_COUNT


When you have fewer CPU threads than processes that have something to run on CPU, the OS will schedule them to share the CPU resource. Increasing the workload at that point will not increase the throughput because you have reached the capacity of your system, and response time will increase because of queuing. Actually, performance will even decrease because of the overhead of context switching when trying to share the processors.
When you don’t want the OS scheduler to do the resource sharing job, you can, and should, use Instance Caging. For sure, the database instance can do resource sharing more intelligently than the OS as it knows the kind of workload and the performance requirement of each process.

I did some tests on an 8-CPU machine running SLOB from 32 concurrent sessions, then 31, then 30,… down to the last run with 1 session, each for 5 minutes. This is what you see on the right-most dark green triangle here:
CaptureSLOBCPUCOUNT01
After a very short period of library cache contention, when all 32 sessions are parsing their statements, the load decreases with each run. The dark green here is labelled as ‘CPU + CPU wait’ and comes from ASH, where all sessions are in state ‘ON CPU’ even when they are actually in the OS runqueue. Of course, I have only 8 CPU threads, so I cannot have 32 sessions running on CPU.

The runs on the left, where you can see the same shape but with some light green, are the same runs but with Instance Caging active. I’ve a resource manager plan set and I’ve set CPU_COUNT to 8 (the first run on the left), then 7, … down to 1. The dark green is still the ‘ON CPU’ state, and with Instance Caging Oracle allows at most CPU_COUNT processes in that state. The remaining processes are switched to a waiting state, instrumented as ‘resmgr: cpu quantum’ and displayed in light green.

My goal is to show that you can increase the throughput with Instance Caging. I measured the logical reads per second and made an Excel chart from them. The blue lines are from different CPU_COUNT settings from 8 to 1. The orange line is from not setting CPU_COUNT, which means that instance caging is not enabled. On the X axis you have the number of concurrent SLOB sessions I’ve run. What you see from the blue lines is that the throughput increases linearly with the number of concurrent sessions until it reaches the limit: either the CPU_COUNT limit or the physical limit when CPU_COUNT is not set. Note that the CPU threads are not cores here. Tests were done on Oracle Public Cloud 4 OCPUs (aka OC5 compute shape) which are actually 8 threads from E5-2690 v2 Intel processors. This is why running on two threads here does not double the throughput. Actually, when running 8 sessions on 8 threads the throughput is only 6 times that of one session on one thread.

CaptureSLOBCPUCOUNT

The second goal is to compare Oracle instance caging with the OS scheduler when the instance is using the full capacity of the server. On the top you can see the darker blue line, which is when CPU_COUNT is set to the actual number of CPU threads (CPU_COUNT=8). The orange line is when no CPU_COUNT is set: instance caging is disabled. The maximum throughput then, 3.6 MLR/s, is reached when we run the same number of sessions as the number of CPU threads. What you see here is that when the server is overloaded, scheduling at instance level is more efficient than scheduling at OS level. Without instance caging, the orange line, the LR/s degrades because of context switching overhead. So the recommendation here is to always do instance caging even if you have only one instance on your server.

Why is the instance caging algorithm better than the OS scheduler? Because it is focused on the workload of database processes. Here are the graphs of the ‘resmgr: cpu quantum’ wait times.

CaptureSLOBCPUCOUNT02

On the left, I’ve run with CPU_COUNT=8. When I have 32 concurrent sessions, each of them spends 3/4 of its time waiting for CPU. Those waits are about 300 milliseconds. When I have only 9 sessions, each one spends only a small part of its response time waiting. They wait about 25 milliseconds on ‘resmgr: cpu quantum’. The wait time is not fixed and depends on the load. This makes sense: when you know you will have to spend a long time waiting, it’s better to have longer waits in order to avoid too many context switches. On the right, it’s the same but with CPU_COUNT=1, which gives 8 times less CPU time to the processes. They will have to spend more time waiting. And we see that the wait time is adjusted: it can go up to 4-second time slices. The OS scheduler will never do that, leaving a process waiting on the runqueue for several seconds, because the scheduler tries to focus on the response time. It’s different with instance caging. When you know that you will have to spend a long time waiting, then it’s better to optimize throughput by reducing the context switching.
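
The 3/4 figure can be checked with back-of-the-envelope arithmetic. The sketch below assumes a simplified model that is not a description of the actual Resource Manager algorithm: every session is CPU-bound all the time and the CPU_COUNT slots are shared evenly, so each session waits for a fraction 1 - CPU_COUNT/sessions of its time. The wait_fraction function is a hypothetical helper written for this illustration.

```shell
# Hypothetical helper: wait_fraction SESSIONS CPU_COUNT
# Fraction of time a CPU-bound session spends waiting when CPU_COUNT slots
# are shared evenly among SESSIONS sessions (simplified model, see assumptions above).
wait_fraction() {
  awk -v s="$1" -v c="$2" 'BEGIN {
    if (s <= c) print "0.00"            # enough slots for everyone: no waiting
    else printf "%.2f\n", 1 - c / s
  }'
}

wait_fraction 32 8   # 0.75
wait_fraction 9 8    # 0.11
```

Under these assumptions, 32 sessions on CPU_COUNT=8 wait three quarters of the time, while 9 sessions wait only about one ninth of the time, consistent with the small share of ‘resmgr: cpu quantum’ described for that run.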

The recommendation is to enable instance caging: set a resource manager plan and set cpu_count. It’s not an option. There’s no additional cost for it. And it will always be better than letting the OS manage CPU starvation.

On Standard Edition 2, it’s even easier: Oracle Corp. enabled instance caging for you ;)

 

The article CPU_COUNT first appeared on Blog dbi services.
