Search This Blog

Thursday, April 7, 2016

Amazon Redshift Commands

Here are some useful Amazon Redshift commands which we can use for administrator and monitoring purpose:

  • The below command will give the info about the used disk space of redshift cluster nodes:

          select node, diskno, used, capacity, (used*1.0/capacity)*100 as                       "percent" from (select owner as node, diskno, used, capacity from                      stv_partitions order by 1, 2, 3, 4) as T;
  • To get the running queries at Redshift cluster:

           select * from stv_recents where status = 'Running';
  • To get information like disk space, rows of all the tables under specific schema (here i am using for public schema):

          select a.id,trim(pgdb.datname) as Database, trim(pgn.nspname) as                   Schema, trim(a.name) as Table, b.mbytes, a.rows 
          from (select db_id, id, name, sum(rows) as rows
          from stv_tbl_perm a group by db_id, id, name) as a
          join pg_class as pgc on pgc.oid = a.id
          join pg_namespace as pgn on pgn.oid = pgc.relnamespace
          join pg_database as pgdb on pgdb.oid = a.db_id
          join (select tbl, count(*) as mbytes from stv_blocklist
          group by tbl) b on a.id = b.tbl and lower(pgn.nspname)='public'
          order by mbytes desc, a.db_id, pgn.nspname, a.name; 

  • Distribution/Sort key, Unsorted data percentage of tables in "public" schema:

          select * from svv_table_info where schema='public';

  • Active session in Redshift cluster:

          select * from stv_sessions;

  • To find the distribution of table data among nodes:

          select name,slice,  sum(num_values) from svv_diskusage
          where lower(name) = '' and col=0 group by name,slice                   order by slice;

  • To identify and resolve database tables conflict:

         select * from stl_tr_conflict where table_id = order by xact_start_ts;

  • To find which tables data are available in the specific node:

          select tbl,name, sum(num_values) from svv_diskusage where col =0 and            slice in (0,1) group by tbl,name order by sum(num_values) desc;

note: each node is divided into number of slices. Number of slices per node depends upon the cluster size.

  • How many queries are executed till last restart in each user-defined WLM queue:

      select service_class,num_queued_queries, num_executing_queries, 
       num_executed_queries from stv_wlm_service_class_state 
       where service_class > 5;