Channel: The SAS Dummy

Using LIBNAME XLSX to read and write Excel files

When you weren't watching, SAS did it again. We smuggled Yet Another Excel Engine into a SAS release.

SAS 9.4 Maintenance 2 added the XLSX engine, which allows you to read and write Microsoft Excel files as if they were data sets in a library. The big advantage of using this engine is that it accesses the XLSX file directly, and doesn't use the Microsoft data APIs as a go-between. (LIBNAME EXCEL and LIBNAME PCFILES rely on those Microsoft components.) That means that you can use this engine on Windows or Unix systems without having to worry about bitness (32-bit versus 64-bit) or setting up a separate PC Files Server process.

The XLSX engine does require a license for SAS/ACCESS to PC Files. Are you a SAS University Edition user? The SAS/ACCESS product is part of that package, so this technique works there. It's an easy way to get well-formed Excel data into your SAS process.

/* because Excel field names often have spaces */
options validvarname=any;
 
libname xl XLSX '/folders/myfolders/sas_tech_talks_15.xlsx';
 
/* discover member (DATA) names */
proc datasets lib=xl; quit;
 
libname xl CLEAR;

Example output:

[Image: xl_contents]

Once the library is assigned, I can read the contents of a spreadsheet into a new SAS data set:

/* because Excel field names often have spaces */
options validvarname=any;
 
libname xl XLSX '/folders/myfolders/sas_tech_talks_15.xlsx';
 
/* read in one of the tables */
data confirmed;
  set xl.confirmed;
run;
 
libname xl CLEAR;

And here's the result in my SAS University Edition:

[Image: xl_confirmed]

Sometimes you need just one value from a spreadsheet. That's a common use case for dynamic data exchange (DDE), which isn't as feasible as it once was. You can use FIRSTOBS and OBS options to control how much data you retain:

/* read in just one value */
data _null_;
  set xl.confirmed (firstobs=6 obs=6 keep='Job title'n);
  call symput('VALUE','Job Title'n);
run;
%put &value;

Output:

 76         %put &value;
 Testing Manager,  Quality-driven User Experience Testing

You can also use the XLSX engine to create and update XLSX files.

libname xlout XLSX '/folders/myfolders/samples.xlsx';
 
data xlout.cars;
  set sashelp.cars;
run;
 
data xlout.classfit;
  set sashelp.classfit;
run;
 
data xlout.baseball;
  set sashelp.baseball;
run;
 
data xlout.air;
  set sashelp.air;
run;
 
libname xlout clear;

Here is my output in Microsoft Excel with all of these data sets now as sheets:

[Image: xl_xlsxout]

Remember, you can also create Microsoft Excel files with Base SAS by using ODS EXCEL -- experimental in 9.4 Maintenance 2 but production in Maintenance 3 (coming soon).

The XLSX libname is different from the EXCEL and PCFILES engines in other ways. For example, the XLSX engine does not support Excel named ranges (which can surface a portion of a spreadsheet as a discrete table). Also, you won't see the familiar "$" decoration around the spreadsheet names when they are surfaced in the library within SAS. If you need that sort of flexibility, you can use PROC IMPORT to provide more control over exactly what Excel content is brought into SAS and how.
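
As a rough sketch of that PROC IMPORT alternative, the step below reads just a cell range from one sheet. The range A1:D20 and the output name WORK.CONFIRMED_PART are assumptions for illustration; the file path and sheet name are borrowed from the earlier examples.

```sas
/* Sketch only: read part of one sheet with PROC IMPORT.        */
/* The range A1:D20 and the output data set name are assumed;   */
/* adjust them to match your own workbook.                      */
proc import datafile='/folders/myfolders/sas_tech_talks_15.xlsx'
    out=work.confirmed_part
    dbms=xlsx
    replace;
  range='confirmed$A1:D20';   /* sheet name, then the cell range */
  getnames=yes;               /* first row holds the column names */
run;
```

Like the XLSX libname engine, DBMS=XLSX requires SAS/ACCESS to PC Files.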

One other IMPORTANT caution: The XLSX engine is a sequential access engine in that it processes data one record after the other. The engine starts at the beginning of the file and continues in sequence to the end of the file. Some techniques to MODIFY the data in-place will not work. Also, some SAS data viewers cannot render the data from the XLSX engine. SAS VIEWTABLE, SAS Enterprise Guide, and even SAS Studio can't open these tables directly in the data grid view. VIEWTABLE gives you a nice message, but SAS Enterprise Guide simply "hangs" in the attempt. For that reason, I recommend using a DATA step to copy the Excel content that you want to another SAS library, and then using LIBNAME CLEAR on the XLSX library to avoid accidentally opening a table in a viewer that won't support it. (This is currently a bug in SAS Enterprise Guide that should be fixed in a future release.)

I have found LIBNAME XLSX to be a quick, convenient method to bring in Excel data on any SAS platform. If you have SAS 9.4 Maintenance 2 or later, try it out! Let me know how it works for you by sharing a comment here.

tags: excel, ODS EXCEL, SAS 9.4, SAS University Edition, xlsx

The post Using LIBNAME XLSX to read and write Excel files appeared first on The SAS Dummy.


Ask the Expert: Creating custom tasks for SAS Enterprise Guide

If you have not yet discovered the new Ask the Expert series on the SAS Training site, you are missing out on a treasure. Visit the site right now and review all of the available topics, from "Newbie" to Analytics to Visualization to good ol' SAS programming. Go on; I'll wait.

[Image: Ask the Expert - they wear glasses, apparently]

Welcome back! Amazing, right? You can get lost for hours learning new stuff or just reviewing what you thought you already knew. Some topics are available as live sessions that you can "attend" as they happen, but many of them are available on-demand, for free, no strings attached.

You might have noticed that I have a humble contribution in the "expert" collection: Developing Custom Tasks for SAS Enterprise Guide. During the 37-minute video, I lead you through the uses of SAS custom tasks and the basic steps for creating your own task. You'll learn what custom tasks can do and what they cannot do. You'll learn about the tools and APIs that support the creation of tasks. And you'll see the "inside" of a completed custom task project. (An aside: I owe a big "thank you" to the team that post-processed the recording of my expert talk -- I know that I wasn't that smooth and concise when I recorded it!)

It's a short introduction and watching it won't make you an expert in the topic, but it will help you to decide whether to learn more. You can learn more from my book on this topic, or you can arrange to attend an offering of the two-day course that we offer occasionally. Or you can learn it all on your own, as many have. There are plenty of examples and references to work from. If you're wondering what skills you should have before taking the class, watch my "about this course" video here.

In the "ask the expert" video, I referenced a collection of API libraries that make it easier to set up your custom task projects. These API libraries are available for each version of SAS Enterprise Guide, and they allow you to create tasks that are compatible with multiple versions of the SAS applications, even if you do not have those particular versions installed. (You still need at least one version of SAS Enterprise Guide or SAS Add-In for Microsoft Office in order to test and run your custom task.)

With the permission of the SAS R&D developers, I have made those libraries available here:

>> Download custom task API libraries (ZIP file 333KB)

The README.txt file in the ZIP file explains how to use the libraries.

If you have questions as you start your custom task adventures or if you just want to brag about your successes, post back here or on the SAS Enterprise Guide community. I'd love to hear from you!

tags: SAS custom tasks, SAS Enterprise Guide, sas training

Copy an entire process flow in SAS Enterprise Guide

I've seen some crazy process flows in SAS Enterprise Guide. Crazy-big, and crazy-complex, used by real customers to accomplish real work. But while these process flows represent a ton of work, this is usually a calculated investment to automate processes that would be difficult to capture in another way.

For years, SAS Enterprise Guide users have asked for a way to reuse their process flows in new projects. You have always been able to copy-and-paste individual items (tasks, queries, programs) from one project to another, or from one flow to another in the same project. But when you tried to copy a collection of items from a flow, everything fell apart when you pasted into the new destination. The links/relationships among the objects were not retained, and that sabotaged your goal of saving time and effort.

In SAS Enterprise Guide 7.1, this is finally improved. You can now copy an entire flow, or multiple selected items from a flow, and paste that content intact into another project or process flow.

To copy an entire flow, right-click on the flow name within the Project Tree, then select Copy. In your destination project, right-click on an empty spot within the Project Tree and you'll see that Paste is enabled. When you paste, you'll see the entire flow transfer over, including the links and layout. It's like magic.

[Image: copypaste]

To copy just a portion of the process flow, click-drag the cursor to "rubberband" a selection of connected items. (You can also use Ctrl+click to select the items that you want to include.) With the items selected, right-click on one of the selected items and choose Copy. You can then Paste the content into a new or existing process flow.

[Image: copyselect]

You can paste a process flow into a new or an existing SAS Enterprise Guide project. Try it for yourself -- see how much time it can save for you!

Note: this feature was added in SAS Enterprise Guide 7.1. The first update (7.11) improved it further (such as capturing project prompts that are referenced in your flow).

tags: process flows, SAS Enterprise Guide

SAS Enterprise Guide now updates itself

I returned to work from a 2+ week vacation this morning. When I fired up SAS Enterprise Guide (as I do each work day and occasionally on weekends), I was greeted with this message:

[Image: An update to SAS Enterprise Guide is available!]

As a SAS insider, I knew this was coming. It's a new feature that was added in SAS Enterprise Guide 7.11. I intended to write a blog about this before now -- but then I went on vacation instead.

I'm a trusting fellow, but I still clicked on the link in the message to learn more information about the update. All of the improvements seemed good to me, so I clicked Close and Install.

My SAS Enterprise Guide session closed and a moment later I saw the patch being applied:

[Image: Applying patch]

When complete, I was greeted with this good news:

[Image: Your software is now up to date!]

And when I clicked Finish and the application restarted, I checked the Help->About SAS Enterprise Guide window to see that the update was in place.

[Image: about window with HF number]

I think that this "automatic update" is a tremendous feature whose time has come (if it's not overdue). However, not everyone will want to update their software on SAS' schedule. You can defer the update with Remind me later, or select Skip this version in order to not be reminded again (until the next update). You can always check for updates from the Help menu.

tags: auto update, SAS Enterprise Guide

Using Lua within your SAS programs

With apologies to this candy advertisement from the 1980s:

"Hey, you got your Lua in my SAS program."
"You got your SAS code in my Lua program!"

Announcer: "PROC LUA: Two great programming languages that program great together!"

What is Lua? It's an embeddable scripting language that is often used as a way to add user extensions to robust software applications. Lua has been embedded into SAS for some time already, as it's the basis for new ODS destinations like EXCEL and POWERPOINT. But SAS users haven't had a way to access it.

With SAS 9.4 Maintenance 3 (released July 2015), you can now run Lua code in the new LUA procedure. And from within that Lua code, you can exchange data with SAS and call SAS functions and submit SAS statements. (Running SAS within Lua within SAS -- it's just like Inception.)

Paul Tomas, the developer for PROC LUA, presented a demo of the feature and its usefulness in a recent SAS Tech Talk:


 
Paul also wrote a paper for SAS Global Forum 2015: Driving SAS with Lua.

Like many innovations that find their way into customer-facing features, this new item was added to help SAS R&D complete work for a SAS product (specifically, the new version of SAS Forecast Server). But the general technique was so useful that we decided to add it into Base SAS as a way for you to integrate Lua logic.

PROC LUA can be an alternative to the SAS macro language for injecting logical control into your SAS programs. For example, here's a sample program that generates a SAS data set only if the data set doesn't already exist.

proc lua;
submit;

-- example of logic control within LUA
if not sas.exists("work.sample") then
  print "Creating new WORK.SAMPLE"
  sas.submit [[
    data work.sample;
      set sashelp.class;
    run;
  ]]
else
  print "WORK.SAMPLE already exists"
end

endsubmit;
run;

First run:

NOTE: Lua initialized.
Creating new WORK.SAMPLE
    data work.sample;
      set sashelp.class;
    run;

And subsequent runs:

NOTE: Resuming Lua state from previous PROC LUA invocation.
WORK.SAMPLE already exists
NOTE: PROCEDURE LUA used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

Unlike other embedded languages (like PROC GROOVY), Lua runs in the same process as SAS -- and not in a separate virtual machine process like a Java VM. This makes it easy to exchange information such as data and macro variables between your SAS and Lua programming structures.
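
Here's a minimal sketch of that exchange using macro variables. The macro variable names CUTOFF and LUA_MSG are made up for this illustration; sas.symget and sas.symput are the PROC LUA functions for reading and writing macro variables.

```sas
/* Assumed macro variable, defined only for this sketch */
%let cutoff = 13;

proc lua;
submit;
  -- read a SAS macro variable into a Lua variable
  local cutoff = sas.symget("cutoff")
  print("Cutoff passed in from SAS: " .. cutoff)

  -- write a value back to SAS as a new macro variable
  sas.symput("lua_msg", "set from Lua")
endsubmit;
run;

/* The value assigned in Lua is now visible to SAS code */
%put &=lua_msg;
```

Because Lua runs in the SAS process, no files or sockets are needed for the hand-off in either direction.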

If you have SAS 9.4M3 and have time to play with PROC LUA, let us know what interesting applications you come up with!

tags: Lua, SAS 9.4

Copy data and column names from SAS Enterprise Guide

While I've often written about how to get your SAS data to Microsoft Excel in some automated way, I haven't really addressed what's probably the most frequently used method: copy and paste. SAS Enterprise Guide 7.1 added a nifty little feature that makes copy-and-paste even more useful.

The new "Copy with headers" feature creates a tab-delimited version of your selected data cells, complete with a heading row that includes the column names. This is very convenient for copying into a new Microsoft Excel spreadsheet or other table structure (like a Google Docs spreadsheet). Note: this is different than the "copy data attributes" tip that I published a while back. That tip captures the column properties, but not the actual data values.

To get started, simply select the data that you want within the SAS Enterprise Guide data grid. The data cells must be contiguous, but you don't need to select all columns or all cells within the data set. With the selection active, right-click and select Copy with headers.

[Image: copy with headers]

This action places the data values in tab-delimited form onto the Windows clipboard. The first line of the data will be the SAS variable names from your selected data. If you paste this into a text editor that shows a view of "special characters", you can see the tabs along with the end-of-line delimiters.

[Image: tab delimited content]

When you paste the same content into Microsoft Excel, the Excel application knows how to automatically distribute these values into distinct columns. That's just what spreadsheet programs do.

[Image: paste into Excel]

Microsoft Excel isn't the only app that can handle tab-delimited data in this way. You can paste the content into a Google Docs spreadsheet too.

[Image: paste into Google Docs]

Or, you could simply paste into a text file as-is, and then save that file to read into a SAS program at a later time, bringing it full circle.
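
If you go that route, reading the saved file back in might look something like this sketch. The file path and the output data set name are hypothetical; the heading row that "Copy with headers" adds is what lets GETNAMES=YES recover the column names.

```sas
/* Sketch only: read a saved tab-delimited paste back into SAS. */
/* The path and data set name below are assumptions.            */
proc import datafile='/folders/myfolders/pasted_cells.txt'
    out=work.pasted
    dbms=tab
    replace;
  getnames=yes;       /* first line holds the SAS variable names */
  guessingrows=100;   /* rows scanned to guess types and lengths */
run;
```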

tags: excel, SAS Enterprise Guide

Why should we teach Roman numerals?

In my local paper this morning, I read about how a North Carolina state commission plans to recommend changes to our teaching standards for mathematics. One of the topics that they want to bring back: Roman numerals. Why? According to my exhaustive 30 seconds of Internet research, the only practical applications of Roman numerals are: I) understanding Super Bowl numbering, and II) reading the time on old-fashioned clocks.

But I don't need convincing. I believe that there are other advantages of teaching Roman numerals. The main lesson is this: the world has not always revolved around "base 10" numbering, and actually it still doesn't today. Having the ability to express numbers in other forms helps us to understand history, passage of time, technology, and even philosophy*.

In the popular media, binary (base 2) is famous for being "the language of computers". That may be so, but binary is not usually the language of computer programmers. When I was a kid, I spent many hours programming graphics on my TI 99/4A computer. I became proficient in translating decimal to hexadecimal (base 16) to binary -- all to express how the pixels would be drawn on the screen and in what color. Due to lack of practice and today's availability of handy tools and higher-level programming languages, I have since lost the ability to calculate all of these in my head. I also lost the ability to solve any Rubik's Cube that I pick up -- there go all of my party tricks.

But the SAS programming language retains many fun math tricks, including the ability to express numbers in many different ways, instantly. Here's an example of one number expressed six (or 6 or VI or 0110) different ways.

data _null_;
  x = 1956;
  put  / 'Decimal: '     x=best12.;
  put  / 'Roman: '       x=roman10.;
  put  / 'Word: '        x=words50.;
  put  / 'Binary: '      x=binary20.;
  put  / 'Octal: '       x=octal10.;
  put  / 'Hexadecimal: ' x=hex6.;
run;

The output:

Decimal: x=1956
Roman: x=MCMLVI
Word: x=one thousand nine hundred fifty-six
Binary: x=00000000011110100100
Octal: x=0000003644
Hexadecimal: x=0007A4

You might never need some of these number systems or SAS formats in your job, but knowing them makes you a more interesting person. If nothing else, it's a skill that you can trot out during cocktail parties. (I guess I attend different sorts of parties now.)

* For example, the number 'zero' has not always been with us. Introducing it into our numbering system allows us to think about 'nothing' in ways that earlier societies could not.

tags: formats, stem

Using SAS DS2 to parse JSON

Thanks to the proliferation of cloud services and REST-based APIs, SAS users have been making use of PROC HTTP calls (to query these web services) and some creative DATA step or PROC GROOVY code to process the JSON results. Such methods get the job done (JSON is simply text, after all), but they aren't as robust as an official JSON parser. JSON is simple: it's a series of name-value pairs that represent an object in JavaScript. But these pairs can be nested within one another, so in order to parse the result you need to know about the object structure. A parser helps with the process, but you still need to know the semantics of any JSON response.

SAS 9.4 introduced PROC JSON, which allows you to create JSON output from a data set. But it wasn't until SAS 9.4 Maintenance 3 that we have a built-in method to parse JSON content. This method was added as a DS2 package: the JSON package.

I created an example of the method working -- using an API that powers our SAS Support Communities! The example queries communities.sas.com for the most recent posts to the SAS Programming category. Here's a small excerpt of the JSON response.

 "post_time": "2015-09-28T16:29:05+00:00",
  "views": {
  "count": 1
  },
  "subject": "Re: How to code for the consecutive values",
  "author": {
  "href": "\/users\/id\/13884",
  "login": "ballardw"

Notice that some items, such as post_time, are simple one-level values. But other items, such as views or author, require a deeper dive to retrieve the value of interest ("count" for views, and "login" for author). The DS2 JSON parser can help you to navigate to those values without you needing to know how many braces or colons or commas are in your way.

Here is an example of the result: a series plot from PROC SGPLOT and a one-way frequency analysis from PROC FREQ. The program also produces a detailed listing of the messages, the topic content, and the datetime stamp.

[Image: series]

[Image: boardfreq]

This is my first real DS2 program, so I'm open to feedback. I already know of a couple of improvements I should make, but I want to share it now as I think it's good enough to help others who are looking to do something similar.

The program requires SAS 9.4 Maintenance 3. It also works fine in the most recent version of SAS University Edition (using SAS Studio 3.4). All of the code runs using just Base SAS procedures.

/* DS2 program that uses a REST-based API */
/* Uses http package for API calls       */
/* and the JSON package (new in 9.4m3)   */
/* to parse the result.                  */
proc ds2; 
  data messages (overwrite=yes);
    /* Global package references */
    dcl package json j();
 
    /* Keeping these variables for output */
    dcl double post_date having format datetime20.;
    dcl int views;
    dcl nvarchar(128) subject author board;
 
    /* these are temp variables */
    dcl varchar(65534) character set utf8 response;
    dcl int rc;
    drop response rc;
 
    method parseMessages();
      dcl int tokenType parseFlags;
      dcl nvarchar(128) token;
      rc=0;
      * iterate over all message entries;
      do while (rc=0);
        j.getNextToken( rc, token, tokenType, parseFlags);
 
        * subject line;
        if (token eq 'subject') then
          do;
            j.getNextToken( rc, token, tokenType, parseFlags);
            subject=token;
          end;
 
        * board URL, nested in an href label;
        if (token eq 'board') then
          do;
            do while (token ne 'href');
               j.getNextToken( rc, token, tokenType, parseFlags );
            end;
            j.getNextToken( rc, token, tokenType, parseFlags );
            board=token;
          end;
 
        * number of views (int), nested in a count label ;
        if (token eq 'views') then
          do;
            do while (token ne 'count');
               j.getNextToken( rc, token, tokenType, parseFlags );
            end;
            j.getNextToken( rc, token, tokenType, parseFlags );
            views=inputn(token,'5.');
          end;
 
        * date-time of message (input/convert to SAS date) ;
        * format from API: 2015-09-28T10:16:01+00:00 ;
        if (token eq 'post_time') then
          do;
            j.getNextToken( rc, token, tokenType, parseFlags );
            post_date=inputn(token,'anydtdtm26.');
          end;
 
        * user name of author, nested in a login label;
        if (token eq 'author') then
          do; 
            do while (token ne 'login');
               j.getNextToken( rc, token, tokenType, parseFlags );
            end;
            * get the author login (username) value;
            j.getNextToken( rc, token, tokenType, parseFlags );
            author=token;
            output;
          end;
      end;
      return;
    end;
 
    method init();
      dcl package http webQuery();
      dcl int rc tokenType parseFlags;
      dcl nvarchar(128) token;
 
      /* create a GET call to the API                                         */
      /* 'sas_programming' covers all SAS programming topics from communities */
      webQuery.createGetMethod(
         'http://communities.sas.com/kntur85557/' || 
         'restapi/vc/categories/id/sas_programming/posts/recent' ||
         '?restapi.response_format=json' ||
         '&restapi.response_style=-types,-null&page_size=100');
      /* execute the GET */
      webQuery.executeMethod();
      /* retrieve the response body as a string */
      webQuery.getResponseBodyAsString(response, rc);
      rc = j.createParser( response );
      do while (rc = 0);
        j.getNextToken( rc, token, tokenType, parseFlags);
        if (token = 'message') then
          parseMessages();
      end;
    end;
 
  method term();
    rc = j.destroyParser();
  end;
 
  enddata;
run;
quit;
 
/* Add some basic reporting */
proc freq data=messages noprint;
    format post_date datetime11.;
    table post_date / out=message_times;
run;
 
ods graphics / width=2000 height=600;
title '100 recent message contributions in SAS Programming';
title2 'Time in GMT';
proc sgplot data=message_times;
    series x=post_date y=count;
    xaxis minor label='Time created';
    yaxis label='Messages' grid;
run;
 
title 'Board frequency for recent 100 messages';
proc freq data=messages order=freq;
    table board;
run;
 
title 'Detailed listing of messages';
proc print data=messages;
run;
 
title;

I also shared this program on the SAS Support Communities as a discussion topic. If you want to contribute to the effort, please leave me a reply with your suggestions and improvements!

tags: DS2, JSON, REST API, SAS 9.4


The famous SAS cowboy hat now fits all SAS users

[Image: cbhat_sg]

Rick Wicklin created a nice example of using the SURFACEPLOTPARM statement to create a surface plot in SAS. As I read it, the question that immediately came to mind was: can I use this to create the famous SAS cowboy hat?

The "cowboy hat" is a highly distributed example of using PROC G3D to create a 3-dimensional rendering of data that resembles...well...a cowboy hat. PROC G3D is a cool SAS proc, but it's part of the SAS/GRAPH product and not everyone has access to that. For example, users of the SAS University Edition cannot run PROC G3D or any SAS/GRAPH programs. But the SG procedures, including SGPLOT and SGRENDER, are built into Base SAS.

Now we can bring the cowboy hat to the next generation of SAS users. Without further ado, here is the SAS program that builds the hat. The program works in SAS Display Manager, SAS Enterprise Guide, and SAS University Edition.

/* Graph Template Language that defines the graph layout    */
/* This needs to be run just once within your SAS session   */
/* From Rick's post at:                                     */
/* http://blogs.sas.com/content/iml/create-surface-plot-sas */
proc template;                        /* surface plot with continuous color ramp */
define statgraph SurfaceTmplt;
dynamic _X _Y _Z _Title;              /* dynamic variables */
 begingraph;
 entrytitle _Title;                   /* specify title at run time (optional) */
  layout overlay3d;
    surfaceplotparm x=_X y=_Y z=_Z /  /* specify variables at run time */
       name="surface" 
       surfacetype=fill
       colormodel=threecolorramp      /* or =twocolorramp */
       colorresponse=_Z;
    continuouslegend "surface";
  endlayout;
endgraph;
end;
run;
 
/* DATA step to create the "hat" data */
data hat; 
 do x = -5 to 5 by .5;
  do y = -5 to 5 by .5;
   z = sin(sqrt(y*y + x*x));
   output;
  end;
 end;
run;	
 
ods graphics / width=1000 height=800;
 
/* And... Render the Hat! */
proc sgrender data=hat template=SurfaceTmplt; 
   dynamic _X='X' _Y='Y' _Z='Z' _Title="Howdy Pardner!";
run;

tags: ODS Graphics

Copy SAS variable names to the clipboard in SAS Enterprise Guide

I recently met SAS user "CSC" at the Analytics 2015 conference. It might be generous to say that he's an avid user of SAS Enterprise Guide; it's probably more accurate to say that he's now accustomed to the tool and he's once again productive. But he still misses some features from his PC SAS days, including this one.

He wants to be able to copy just a list of SAS variable names from a SAS data set, so that he can then paste them into a SAS program (or another document). In PC SAS he had a simple GSUBMIT sequence that captured the names and "copied" them to the Windows clipboard with FILENAME CLIPBRD. That does not work in SAS Enterprise Guide, because SAS doesn't have direct access to the clipboard on your local machine.
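
For context, a FILENAME CLIPBRD approach in PC SAS might look something like this minimal sketch. This is not CSC's actual GSUBMIT sequence: the fileref name and the choice of SASHELP.CLASS are assumptions, and it works only when SAS runs on your local Windows machine.

```sas
/* Hypothetical sketch: put variable names on the Windows clipboard. */
/* Works only in a local Windows SAS session.                        */
filename _cb clipbrd;

data _null_;
  /* SASHELP.VCOLUMN lists the columns of every assigned library */
  set sashelp.vcolumn;
  where libname = 'SASHELP' and memname = 'CLASS';
  file _cb;
  put name;            /* one variable name per line */
run;

filename _cb clear;
```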

CSC posted his question to the SAS Enterprise Guide community, and Tom suggested that a custom task might help. Good answer, but there it sat until CSC and I met in person this week in Las Vegas. After a short discussion and a personal plea, I was able to create the task in about 30 minutes.

[Image: copycolsmenu]

Actually, it's three tasks, to cover three variations of the "Paste" operation. One produces a CSV-style list, another produces CSV over multiple lines, and a third produces a plain list on separate lines with no commas.

[Image: threecopytasks]

You can download and try this custom task too. It works with SAS Enterprise Guide 4.3 and later. Download the task from the SAS support site as a ZIP file. The instructions for installation and use are in the README.txt in the ZIP file.

tags: SAS Communities, SAS custom tasks, SAS Enterprise Guide

Using the ODS statement to add layers in your ODS sandwich

The ODS statement controls most aspects of how SAS creates your output results. You use it to specify the destination type (HTML, PDF, RTF, EXCEL or something else), as well as the details of those destinations: file paths, appearance styles, graphics behaviors, and more. The most common use pattern is the "ODS sandwich." In this pattern, you open the destination with the ODS statement, then include all of the code that generates the substance of the output, and then use an ODS CLOSE statement to finish it off. Here's a classic example:

ods html file="c:\project\myout.html" /* top slice of bread */
  style=journal gpath="c:\project";
  proc means data=sashelp.class;      /* the "meat" */
  run;
 
  proc sgplot data=sashelp.class;
  histogram weight;
  run;
ods html close;                       /* bottom slice */

But did you know that you can insert more ODS statements to adjust ODS behavior midstream? These allow you to use a variety of ODS behaviors within a single result. You can create your own "Dagwood sandwich" version of SAS output! For cultural reference:

Dagwood sandwich: A Dagwood is a tall, multi-layered sandwich made with a variety of meats, cheeses, and condiments. It was named after Dagwood Bumstead, a central character in the comic strip Blondie, who is frequently illustrated making enormous sandwiches. Source: Wikipedia

Here's an example program that changes graph style and title behavior within a single ODS output file. You should be able to try this code in any SAS programming environment.

ods _all_ close;
%let outdir = %sysfunc(getoption(WORK));
ods graphics / width=400 height=400;
 
ods html(id=dagwood) file="&outdir./myout.html"
  style=journal gtitle  
  gpath="&outdir.";
 
  title "Example ODS Dagwood sandwich";
  proc means data=sashelp.class;
  run;
ods layout gridded columns=2;
ods region;
ods html(id=dagwood) style=statdoc ;  
  proc sgplot data=sashelp.class;
  title "This title is part of the graph image, Style=STATDOC";
  histogram weight;
  run;
ods region;
ods html(id=dagwood) style=raven nogtitle;
  title "This title is in the HTML, Style=RAVEN";
  proc sgplot data=sashelp.class;
  histogram height;
  run;
ods layout end;
 
ods html(id=dagwood) close;

[Image: odsex]

Here's the result, plus some important items to note about this technique.

  • It's a good practice to distinguish each ODS destination with an ID= value. This allows you to reference the intended ODS stream with no ambiguity. After all, you can have multiple ODS destinations open at once, even multiple destinations of the same type. In my example, I used ID=dagwood to make it obvious which destination the statement applies to.
  • You can use this technique to modify only those directives that can change "mid-file" to apply to different parts of the output. You can't modify those items that apply to the entire file, such as PATH, ENCODING, STYLESHEET and many more. These can be set just once when you create the file; setting multiple different values wouldn't make sense.
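
As a minimal sketch of why the ID= value matters, the following program opens two HTML destinations at once and then changes the style of only one of them midstream. The IDs (plain and fancy) and the file names are made up for illustration; they are not part of the example above.

```sas
%let outdir = %sysfunc(getoption(WORK));

/* Two HTML destinations open at the same time, told apart by ID= */
/* (the IDs and file names here are hypothetical)                 */
ods html(id=plain) file="&outdir./plain.html"
  style=journal gpath="&outdir.";
ods html(id=fancy) file="&outdir./fancy.html"
  style=statdoc gpath="&outdir.";

title "First graph, written to both files";
proc sgplot data=sashelp.class;
  histogram weight;
run;

/* Change the style midstream for just one of the destinations */
ods html(id=fancy) style=raven;

title "Second graph, written to both files";
proc sgplot data=sashelp.class;
  histogram height;
run;

ods html(id=plain) close;
ods html(id=fancy) close;
```

If this behaves like the Dagwood example, plain.html keeps the JOURNAL look throughout, while the second graph in fancy.html should pick up the RAVEN style.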

You can use this technique within those applications that generate ODS statements for you, such as SAS Enterprise Guide. For example, to modify the default SAS Enterprise Guide HTML output "midstream", add a statement like:

ods html(id=eghtml) /*... plus your options, like STYLE=*/ ;
ods html(eghtml) /* this shorthand works too */ ;

In SAS Studio or SAS University Edition, try this:

ods html5(id=web) /*... plus your options, like STYLE=*/ ;
ods html5(web) /* this shorthand works too */ ;

Example: an ODS Graphics style sampler

Here's one more example that puts it all together. Have you ever wanted an easy way to check the appearance of the dozens of different built-in ODS styles? Here's a SAS macro program that you can run in SAS Enterprise Guide (with the HTML result on) that generates a "sampler" of graphs that show variations in fonts, colors, and symbols across the different styles.

This example uses ODS LAYOUT (production in SAS 9.4) to create a gridded layout of example plots. If you want to try this in SAS Studio or in SAS University Edition, you can adjust one line in the program (as noted in the code comments).

/* Run within SAS Enterprise Guide       */
/* with the HTML result option turned ON */
%macro styleSampler;
title;
proc sql noprint;
  select style into :style1-:style99  
    from sashelp.vstyle 
    where libname="SASHELP" and memname="TMPLMST";
quit;
 
  ods layout gridded columns=4;
  ods graphics / width=300 height=300;
  %do index=1 %to &sqlobs;
    ods region;
    ods html(eghtml) gtitle style=&&style&index.;
    /* In SAS Studio, use this instead: */
    /* ods html5(web) gtitle style=&&style&index.; */
    title "Style=&&style&index.";
    proc sgplot data=sashelp.class;
    scatter x=Height y=Weight /group=Sex;
    reg x=Age y=Weight / x2axis;
    run; 
  %end;
  ods layout end;
%mend;
 
%styleSampler;

See also

Take control of ODS results in SAS Enterprise Guide
Best way to suppress ODS output in SAS
Advanced ODS Graphics techniques: a new free book

tags: ods, SAS programming

The post Using the ODS statement to add layers in your ODS sandwich appeared first on The SAS Dummy.

A viral video that was 47 years in the making


American side seen from Canada in 2013

When he filmed the scene in the summer of '69, my Dad did not foresee his moment of fame in 2016. But in the last two days, Dad has seen his 47-year-old work appear in the local Buffalo, NY media, on DailyMail.com, and on FOX News*.

In August of 1969, on a family outing to Niagara Falls, Dad filmed a remarkable scene. It was during the time that engineers had "turned off" the American side of the Falls**, diverting most of the water to the Canadian side, while scientists studied the natural wonder for erosion patterns. Did you know that it was possible to turn off the mighty Niagara Falls? Yes, it's been done. And there is renewed interest in the event because Niagara Falls authorities are talking about doing it again.

 
It might be generous to call the video "viral." By most definitions, a video can be called "viral" if it receives a million views in a day, or 3 to 5 million views in a few days. This video (on my personal YouTube channel) has received only about 40,000 views in the past day. Not viral, but let's call it "burgeoning" (thank you Roget's). (UPDATE: one week later, the video now has over 100,000 views!)

Here's the historical timeline of this video:

  • August 1969: Dad films the dewatered Niagara Falls on a common 8mm film camera. I'm in the video at the end -- that's me in the stroller (I was 1 year old) with my Mom.
  • November 2006: as steward of the family 8mm films, I digitize the film and edit it, adding some explanatory text and a bed of music that I don't have the rights to use (hey, that makes me a citizen of the Internet).
  • April 2011: I upload the video to YouTube. I figured it would be interesting to some, as it captured a rare event. A once-in-a-lifetime event, we might have thought back then. In nearly 5 years, the video accumulates only a few thousand views.
  • January 2016: a perfect storm makes the video super popular. The conditions of this storm: a related modern story renews interest, the video contains relatively rare footage, and (maybe most importantly) the video producer (me) is available and responsive to grant permission to these media outlets.

That last point was probably crucial. In all three cases (Buffalo News, Daily Mail, and FOX News), the stories were produced within hours of the reporters reaching out to me. The stories were happening with or without my video. Like so many events in my life, this was all about being in the right place at the right time.

This blog topic is a departure from my usual discussion of SAS topics, so let's tie it back with a view of some YouTube stats. YouTube provides video analytics to any user with a YouTube channel, but the stats usually lag by several days. It's too soon to see the aggregated view of my stats that include the past two days. But, YouTube does offer a "real time" view of what is happening with your video right now. Here's my snapshot from this morning:
ytstats

If you watch the video in the next few days you'll be subjected to some advertising. That's how YouTube generates revenue from popular content. Thanks to my use of copyrighted music, I don't really have a chance to benefit financially from this sudden burst of activity. But that's okay with me -- I enjoy just watching the phenomenon to see how far it goes.


* FOX News reporters reached out to me yesterday and said the story would air yesterday afternoon. I haven't seen it, but my Dad confirmed they aired the video and gave him the photo credit.


** The idea of "turning off the Falls" sounds crazy to some, but really it's an impressive feat of engineering that was mastered decades ago. People are not generally aware that much of the "Falls" volume is diverted every day right now to provide hydroelectric power to the Northeast. Remember Y2K? When people were worried that the power grid might shut down when the year turned to 2000, one certainty remained: water would continue to flow over the Falls. The Niagara Falls hydro plant played a critical role in disaster preparations for Y2K. Of course, nothing came of it – Y2K was a big disappointment in that respect.

tags: social media

The post A viral video that was 47 years in the making appeared first on The SAS Dummy.

Sorting data in SAS: can you skip it?


TL;DR

The next time that you find yourself writing a PROC SORT step, verify that you're working with the SAS Base engine and not a database. If your data is in a database, skip the SORT!

The details: When to skip the PROC SORT step

Many SAS procedures allow you to group multiple analyses into a single step through use of the BY statement. The BY statement groups your data records by each unique combination of your BY variables (yes, you can have more than one), and performs the PROC's work on each distinct group.

When using the SAS Base data engine (that's your SAS7BDAT files), BY-group processing requires data records to be sorted, or at least pre-grouped, according to the values of the BY variables. The reason for this is that the Base engine accesses data records sequentially. When a SAS procedure is performing a grouped analysis, it expects to encounter all records of a group in a contiguous sequence. What happens when records are out of order? You might see an error like this:

ERROR: Data set SASHELP.CLASS is not sorted in ascending sequence. 
The current BY group has Sex = M and the next BY group has Sex = F.

I first described this in 2010: Getting out of SORTs with SAS data.

In a recent post, Rick Wicklin discussed a trick you can use to tell SAS that your data are already grouped, but the group values might not be in a sorted order. The NOTSORTED option lets you avoid a SORT step when you can promise that SAS won't encounter different BY group values interleaved across the data records.
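
Here is a small sketch of the NOTSORTED idea. The WORK.GROUPED data set is a made-up example: it stacks the M records ahead of the F records, so the data are grouped by Sex but not in ascending order.

```sas
/* Build a copy of CLASS that is grouped by Sex, but not sorted: */
/* all of the M records come first, then all of the F records    */
data work.grouped;
  set sashelp.class(where=(sex='M'))
      sashelp.class(where=(sex='F'));
run;

/* A plain "by sex;" would raise the not-sorted error here.  */
/* NOTSORTED tells SAS to start a new BY group whenever the  */
/* value of Sex changes, without expecting ascending order.  */
proc means data=work.grouped;
  by sex notsorted;
  var height weight;
run;
```

The promise you make with NOTSORTED is that each group is contiguous. If the M and F records were interleaved, SAS would treat every change in value as a new group.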

Sorting data is expensive. In data tables that have lots of records, the sort processing requires tremendous amounts of temporary disk space for its I/O operations -- and I/O is usually the slowest part of any data processing. But here's an important fact for SAS programmers: a SORT step is required only for SAS data sets that you access using the Base engine*. If your data resides in a database, you do not need to sort or group your data in order to use BY group processing. And if you do sort the data first (as many SAS programmers do, out of habit), you're wasting time.
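
If you aren't sure which engine sits behind a libref, one way to check (a sketch, not the only approach) is to query the DICTIONARY.LIBNAMES table. The librefs in the WHERE clause are placeholders; substitute your own.

```sas
/* List the engine behind each libref of interest.         */
/* DISTINCT collapses the multiple rows that SAS reports   */
/* for concatenated libraries such as SASHELP.             */
proc sql;
  select distinct libname, engine
    from dictionary.libnames
    where libname in ('SASHELP','WORK');
quit;
```

Base libraries typically report an engine name such as V9. A database library reports the name of its SAS/ACCESS engine instead.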

I'm going to use a little SAS trick to illustrate this in a program. Imagine two copies of the ubiquitous CLASS data set, one in a Base library and one in a database library. In my example I'll use the SPDE engine as the database, even though it's not a separate database server. (Yes! You can do this too! SPDE is part of Base SAS.)

/* these 3 lines will create a temp space for your */
/* SPDE data */
/* See: http://blogs.sas.com/content/sasdummy/use-dlcreatedir-to-create-folders/ */
options dlcreatedir;
libname t "%sysfunc(getoption(WORK))/spde"; 
libname t clear;
 
/* assign an SPDE library. Works like a database! */
libname spde SPDE "%sysfunc(getoption(WORK))/spde";
/* copy a table to the new library */
data spde.class;
 set sashelp.class;
run;
 
/* THIS step produces an error, because CLASS */
/* is not sorted by SEX */
proc reg data=sashelp.class;
    by sex;
    model age=weight;
run;
quit;
 
/* THIS step works correctly.  An implicit          */
/* ORDER BY clause is pushed to the database engine */
proc reg data=spde.class;
    by sex;
    model age=weight;
run;
quit;

Why does the second PROC REG step succeed? It's because the requirement for sorted/grouped records is passed through to the database using an implicit ORDER BY clause. You don't see it happening in your SAS log, but it's happening under the covers. Most SAS procedures are optimized to push these commands to the database. Most databases don't really have the concept of sorted data records; they return records in whatever sequence you request. Returning sorted data from a database doesn't have the same performance implications as a SAS-based PROC SORT step.

How does SAS Enterprise Guide generate optimized code?

Do you use SAS Enterprise Guide tasks to build your analyses? If so, you might have noticed that the built-in tasks go to great lengths to guarantee that your task will encounter data that are properly sorted. Consider this setup for the Linear Regression task:
Linear Regression task
Here's an example of the craziness that you'll see from the Linear Regression task when you have a BY variable in a Base SAS data set. There is "defensive" keep-and-sort code in there, because we want the task to work properly for any data scenario.

/* -------------------------------------------------------------------
   Determine the data set's type attribute (if one is defined)
   and prepare it for addition to the data set/view which is
   generated in the following step.
   ------------------------------------------------------------------- */
DATA _NULL_;
 dsid = OPEN("SASHELP.CLASS", "I");
 dstype = ATTRC(DSID, "TYPE");
 IF TRIM(dstype) = " " THEN
  DO;
  CALL SYMPUT("_EG_DSTYPE_", "");
  CALL SYMPUT("_DSTYPE_VARS_", "");
  END;
 ELSE
  DO;
  CALL SYMPUT("_EG_DSTYPE_", "(TYPE=""" || TRIM(dstype) || """)");
  IF VARNUM(dsid, "_NAME_") NE 0 AND VARNUM(dsid, "_TYPE_") NE 0 THEN
   CALL SYMPUT("_DSTYPE_VARS_", "_TYPE_ _NAME_");
  ELSE IF VARNUM(dsid, "_TYPE_") NE 0 THEN
   CALL SYMPUT("_DSTYPE_VARS_", "_TYPE_");
  ELSE IF VARNUM(dsid, "_NAME_") NE 0 THEN
   CALL SYMPUT("_DSTYPE_VARS_", "_NAME_");
  ELSE
   CALL SYMPUT("_DSTYPE_VARS_", "");
  END;
 rc = CLOSE(dsid);
 STOP;
RUN;
 
/* -------------------------------------------------------------------
   Sort data set SASHELP.CLASS
   ------------------------------------------------------------------- */
PROC SORT
 DATA=SASHELP.CLASS(KEEP=Age Height Sex &_DSTYPE_VARS_)
 OUT=WORK.SORTTempTableSorted &_EG_DSTYPE_
 ;
 BY Sex;
RUN;
TITLE;
TITLE1 "Linear Regression Results";
PROC REG DATA=WORK.SORTTempTableSorted
  PLOTS(ONLY)=ALL
 ;
 BY Sex;
 Linear_Regression_Model: MODEL Age = Height
  /  SELECTION=NONE
 ;
RUN;
QUIT;

This verbose code drives experienced SAS programmers crazy. But unlike a SAS programmer, the SAS Enterprise Guide code generator does not understand all of the nuances of your data, and thus can't guess what steps can be skipped. (SAS macro programmers: you know what I'm talking about. Think about all of the scenarios you have to code for/defend against in a generalized macro.)

And when running the same task with a database table? SAS Enterprise Guide detects the table source is a database, and builds a much more concise version:

TITLE;
TITLE1 "Linear Regression Results";
PROC REG DATA=SPDE.CLASS
  PLOTS(ONLY)=ALL
 ;
 BY Sex;
 Linear_Regression_Model: MODEL Age = Weight
  /  SELECTION=NONE
 ;
RUN;
QUIT;

Consider the data source, and -- if you can -- skip the SORT!

* The Base engine is not the only sequential data engine in SAS, but it's the most common.

tags: in-database, SAS Enterprise Guide, SAS programming

The post Sorting data in SAS: can you skip it? appeared first on The SAS Dummy.

Using PROC IOMOPERATE to list and stop your SAS sessions


If you're a SAS administrator, you probably know that you can use SAS Management Console to view active SAS processes. These are the SAS sessions that have been spawned by clients such as SAS Enterprise Guide or SAS Add-In for Microsoft Office, or those running SAS stored processes. But did you know that you can generate a list of these processes with SAS code? It's possible with the IOMOPERATE procedure.

To use PROC IOMOPERATE, you need to know the connection information for the SAS Object Spawner: the host, port, and credentials that are valid for connecting to the Object Spawner operator port. You plug this information into a URI scheme like the following:

iom://HOSTNAME:PORT;bridge;user=USERID,pass=PASSWORD

Here's an example:

iom://myserver.company.com:8581;bridge;user=sasadm@saspw,pass=Secret01

Are you squeamish about clear-text passwords? Good for you! You can also use PROC PWENCODE to obscure the password and replace its value in the URI, like this:

iom://myserver.company.com:8581;bridge;user=sasadm@saspw,
 pass={SAS002}BA7B9D0645FD56CB1E51982946B26573
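
As a sketch of that encoding step, using the same placeholder password as the example above:

```sas
/* Writes the encoded form of the password to the SAS log. */
/* 'Secret01' is the placeholder password from the example */
proc pwencode in='Secret01' method=sas002;
run;
```

Copy the {SAS002} value from the log and use it in place of the clear-text password in the URI. Keep in mind that this encoding obscures the password; it is not a substitute for protecting the program file itself.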

Getting useful information from PROC IOMOPERATE is an iterative process. First, you use the LIST SPAWNED command to show all of the spawned SAS processes:

%let connection=
  'iom://myserver.company.com:8581;bridge;user=sasadm@saspw,pass=Secret01';
 
/* Get a list of processes */
proc iomoperate uri=&connection.;
    list spawned out=spawned;
quit;

Example output:
listspawned
You can retrieve more details about each process by running subsequent IOMOPERATE steps with the LIST ATTRS command. This can get tedious if you have a long list of spawned sessions. I've wrapped the whole shebang into a SAS program that discovers the processes and iterates through the list for you.

%let connection=
 'iom://myserver.company.com:8581;bridge;user=sasadm@saspw,pass=Secret01';
 
/* Get a list of processes */
proc iomoperate uri=&connection.;
    list spawned out=spawned;
quit;
 
/* Use DOSUBL to submit a PROC IOMOPERATE step for   */
/* each SAS process to get details                   */
/* Then use PROC TRANSPOSE to get a row-wise version */
data _null_;
    set spawned;
    /* number each output data set */
    /* for easier appending later  */
    /* TPIDS001, TPIDS002, etc.    */
    length y $ 3;
    y = put(_n_,z3.);
    x = dosubl("
    proc iomoperate uri=&connection. launched='" || serverid || "';
    list attrs cat='Information' out=pids" || y || ";
    quit;
    data pids" || y || ";
    set pids" || y || ";
    length sname $30;
    sname = substr(name,find(name,'.')+1);
    run;
 
    proc transpose data=work.pids" || y || "
    out=work.tpids" || y || "
    ;
    id sname;
    var value;
    run;
    ");
run;
 
/* Append all transposed details together */
data allpids;
    set tpids:;
    /* calculate a legit datetime value */
    length StartTime 8;
    format StartTime datetime20.;
    starttime = input(UpTime,anydtdtm19.);
run;
 
/* Clean up */
proc datasets lib=work nolist;
delete tpids:;
delete spawned;
quit;

The output details include "up time" (when the process was launched), the process ID (a.k.a. PID), the owner account, the SAS version, and more. Here's a snippet of some example output:
detailspawn

You can use this information to stop a process, if you want. That's right: from a SAS program, you can end any (or all) of the spawned SAS processes within your SAS environment. That's a handy addition to the SAS administrator toolbox, though it should be used carefully! If you stop a process that's in active use, an unsuspecting SAS Enterprise Guide user might lose work. And he won't thank you for that!

To end (kill) a SAS process, you need to reference it by its unique identifier. In this case, that's not the PID -- it's the UUID that the LIST ATTRS command provided. Here's an example of the STOP command:

/* To STOP a process */
    proc iomoperate uri=&connection.;                                  
        STOP spawned server 
             id="03401A2E-F686-43A4-8872-F3438D272973"; 
    quit;                                                             
/* ID = value is the UniqueIdentifier (UUID)      */
/*  Not the process ID (PID)                      */

It seemed to me that this entire process could be made easier with a SAS Enterprise Guide custom task, so I've built one! I'll share the details of that within my next blog post.

tags: IOMOPERATE, sas administration

The post Using PROC IOMOPERATE to list and stop your SAS sessions appeared first on The SAS Dummy.

SAS knows it's a leap year. Do you?


Leap year questions come up all of the time in computing, but if there is any true season for it, it's now. The end of February is approaching and developers wonder: does my process know that it's a leap year, and will it behave properly?

People often ask how to use SAS to calculate the leap years. The complicated answer is:

  • Check whether the year is divisible by 4 (MOD function)
  • But exclude the years that are divisible by 100
  • Yeah...except keep them when they're also divisible by 400.

The simple answer is: ask SAS. You can create a SAS date value with the MDY function. Feb 29 is a valid date for leap years; in off years, MDY returns a missing value.

data leap_years(keep=year);
  length date 8;
  do year=2000 to 2200;
    /* MISSING when Feb 29 not a valid date */
    date=mdy(2,29,year);
    if not missing(date) then
      output;
  end;
run;

Here's an excerpt of the result:

2000
...
2080
2084
2088
2092
2096
2104
2108
2112
2116

Notice how 2000 is included, but 2100 is not? That's because 2100 is not a leap year, and SAS knows it. Did you?
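
If you want to convince yourself that the two answers agree, here is a small sketch that compares the divisibility rule (4, 100, 400) with the MDY approach. The data set and variable names are made up for this check.

```sas
/* Compare the arithmetic rule with the "ask SAS" approach. */
/* Only the years where the two methods disagree are kept.  */
data work.disagree(keep=year by_rule by_mdy);
  do year=1900 to 2400;
    /* divisible by 4 and not by 100, or divisible by 400 */
    by_rule = (mod(year,4)=0 and mod(year,100) ne 0)
              or mod(year,400)=0;
    /* Feb 29 is a valid date only in a leap year */
    by_mdy = not missing(mdy(2,29,year));
    if by_rule ne by_mdy then output;
  end;
run;
```

The two methods agree for every year in this range, so WORK.DISAGREE should come back with zero observations. As with the program above, the log will contain notes about invalid arguments to MDY for the non-leap years; those are expected.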

See also

In the year 9999...: history of leap year and some software bugs

tags: leap year, SAS programming

The post SAS knows it's a leap year. Do you? appeared first on The SAS Dummy.


A custom task to list and stop your SAS sessions


Last week I described how to use PROC IOMOPERATE to list the active SAS sessions that have been spawned in your SAS environment. I promised that I would share a custom task that simplifies the technique. Today I'm sharing that task with you.

How to get the SAS Spawned Processes task

You can download the task from this SAS communities topic, where I included it as an attachment. The instructions for installation are standard for any custom task; the details are included in the README file that is part of the task package.

You can also view and pull the source code for the task from my GitHub repository. I built it using Microsoft .NET and C#.

How to use the SAS Spawned Processes task

Once you have the task installed, you can access it from the Tools->Add-In menu in SAS Enterprise Guide. (By the way, the task should also work in the SAS Add-In for Microsoft Office -- though the installation instructions are a little different.)

The task works by using PROC IOMOPERATE to connect to the SAS Object Spawner. You'll need to provide the connection information (host and port) plus the user/password for an account that has the appropriate permissions (usually a SAS admin account). Note that the port value is that of the Object Spawner operator port (by default, 8581) and not the SAS Metadata Server.

spawnedprocesses
The task shows a list of active SAS processes. Of course, you're using a SAS process to even run the task, so your active process is shown with a yellow highlight. You can select any of the processes in the list and select End Process to stop it. You can drill into more detail for any selected process with the Show Details button. Here's an example of more process details:

processprops
Did you try the task? How did it work for you? Let me know here or in the SAS communities.

Custom task features within this example

If you're professionally interested in how to build custom tasks, this example shows several techniques that implement common requirements. Use the source code as a reference to review how these are built (and of course you can always refer to my custom tasks book for more guidance).

  • Submit a SAS program in the background with the SasSubmitter class. There are two examples of this in the task. The first example is an asynchronous submit to get the list of processes, where control returns to the UI and you have the option to cancel if it takes too long. With an asynch submit, there are some slightly tricky threading maneuvers you need to complete to show the results in the task. The second example uses a synchronous submit (SubmitSasProgramAndWait) to stop a selected SAS process.
  • Read a SAS data set. The SAS program that retrieves a list of processes places that result in a SAS data set. This task uses the SAS OLE DB provider to open the data set and read the fields in each row, so it can populate the list view within the task.
  • Detect errors and show the SAS log. If the SAS programs used by the task generate any errors (for example, if you supply the wrong credentials), the task uses a simple control (SAS.Tasks.Toolkit.Controls.SASLogViewDialog) to show the SAS log -- color-coded so the error is easy to spot.
  • Retrieve the value of a SAS macro variable by using SasServer.GetSasMacroValue("SYSJOBID"). This pulls the process ID for your active SAS session, so I can compare it to those retrieved by PROC IOMOPERATE. That's how I know which list item to highlight in yellow.
  • Save and restore settings between uses. Entering credentials is a drag, so the task uses a helper class (SAS.Tasks.Toolkit.Helpers.TaskUserSettings) to save your host/port/user information to a local file in your Windows profile. When you use the task again, the saved values are placed into the fields for you. I don't save the password -- I'm sure that I'd get complaints if I did that, even if I encoded it.

tags: IOMOPERATE, sas administration, SAS custom tasks, SAS Enterprise Guide

The post A custom task to list and stop your SAS sessions appeared first on The SAS Dummy.

The zoomiest new feature in SAS Enterprise Guide 7.12


Have you ever been in a meeting in which a presenter is showing content on a web page -- but the audience can't read it because it's too small? Then a guy sitting in the back of the room yells, "Control plus!". Because, as we all know (right?), "Ctrl+" is the universal key combination that zooms your browser content.

I'm that guy -- the one who is shouting out the key combos. Every time. And as it turns out, we don't all know about this handy way to magnify the content on a page. And do you know what else works? Holding down the Ctrl key while sliding the mouse wheel. That trick also works in Microsoft Office products like Excel, Word, and even Outlook e-mail.

With the latest release of SAS Enterprise Guide (version 7.12, released last week), you can now use these ubiquitous magical key combinations to zoom your SAS content: HTML results, process flow, data records, and even your SAS program code. In my lucky position as a SAS insider, I've been using this for months and it's absolutely my favorite new thing. Have a lengthy SAS program? Here's a fun thing to do: zoom the program editor out to 10% to see the shape of your code. Then compare with those from your friends and draw sweeping conclusions about each other, Rorschach-test style.

Here are a few animated screen shots from SAS Enterprise Guide 7.12, showing the Ctrl+ zooming in action for code, data, and HTML results:

Zoom in close to your code

See your data near and far

Close in on your HTML results

tags: SAS Enterprise Guide

The post The zoomiest new feature in SAS Enterprise Guide 7.12 appeared first on The SAS Dummy.

Add files to a ZIP archive with FILENAME ZIP


In previous articles, I've shared tips about how you can work with SAS and ZIP files without requiring an external tool like WinZip, gzip, or 7-Zip.

But a customer approached me the other day with one scenario I missed: how to add SAS data sets to an existing ZIP file. It's a variation of a tip that I've already shared, but with two differences. First, in order to add a data set to a ZIP file, you have to know its physical filename -- not just the LIBNAME.MEMBER reference that you use in SAS procedure steps. And second, I had not shown how to add a new file to an existing ZIP archive -- though it turns out that's pretty simple.

Find the file name for a SAS data set

There are several ways to do this. For my approach, I used the output from PROC CONTENTS. Notice that I had to capture the ODS output (not the OUT= data set) to grab the file name. I wrapped it in a macro for easy reuse. And since I ultimately need a SAS fileref to map to the path, I've assigned one (data_fn) in my macro.

/* macro to assign a fileref to a SAS data set in a Base library */
%macro assignFilerefToDataset(_dataset_name);
    %local outDsName;
    ods output EngineHost=File;
    proc contents data=&_dataset_name.;
    run;
    proc sql noprint;
        select cValue1 into: outDsName 
            from work.file where Label1="Filename";
    quit;
    filename data_fn "&outDsName.";
%mend;

How to add a new member to a ZIP file

Now that I have the source file, I need to designate a destination file in a ZIP archive. The FILENAME ZIP method will create a new ZIP file if one does not yet exist, or it can add to an existing ZIP. To ensure I'm starting from scratch, I assign a simple fileref to my target destination and then delete the file.

/* Assign the fileref - basic file method */
filename projzip "&projectDir./project.zip";
/* Start with a clean slate - delete ZIP if it exists */
data _null_;
    rc=fdelete('projzip');
run;

To create a new ZIP file and designate a path and file name within it, I used the FILENAME ZIP method with the MEMBER= option. Note that I specified the "data/" subfolder in the MEMBER= value; this will place the file into a named subfolder within the archive.

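/* Assign the DATA_FN fileref to the source data set -- CLASS */
%assignFilerefToDataset(sashelp.class);
/* Use FILENAME ZIP to add a new member -- CLASS */
/* Put it in the data subfolder */
filename addfile zip "&projectDir./project.zip" 
    member='data/class.sas7bdat';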

Then finally, I need to actually "copy" the file into the archive. I do this by streaming the source file into the target fileref byte-by-byte:

/* byte-by-byte copy */
/* "copies" the new file into the ZIP archive */
data _null_;
    infile data_fn recfm=n;
    file addfile recfm=n;
    input byte $char1. @;
    put  byte $char1. @;
run;
 
filename addfile clear;

That's it! I now have a ZIP file with one member entry. Now I can "press repeat" to add a second entry:

%assignFilerefToDataset(sashelp.cars);
/* Use FILENAME ZIP to add a new member -- CARS */
/* Put it in the data subfolder */
filename addfile zip "&projectDir./project.zip" 
    member='data/cars.sas7bdat';
/* byte-by-byte copy */
/* "copies" the new file into the ZIP archive */
data _null_;
    infile data_fn recfm=n;
    file addfile recfm=n;
    input byte $char1. @;
    put  byte $char1. @;
run;
 
filename addfile clear;
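
Since the steps repeat for every data set, they could be wrapped in a macro. This is only a sketch: the macro name addDatasetToZip and its parameters are my own invention for illustration, and it assumes that the projectDir macro variable and the %assignFilerefToDataset macro from above are already defined.

```sas
/* Hypothetical helper: add one data set to project.zip */
/*   dataset = LIBNAME.MEMBER reference of the source   */
/*   member  = path and file name inside the archive    */
%macro addDatasetToZip(dataset, member);
    /* map the DATA_FN fileref to the physical file */
    %assignFilerefToDataset(&dataset.);

    filename addfile zip "&projectDir./project.zip"
        member="&member.";

    /* byte-by-byte copy into the ZIP archive */
    data _null_;
        infile data_fn recfm=n;
        file addfile recfm=n;
        input byte $char1. @;
        put  byte $char1. @;
    run;

    filename addfile clear;
    filename data_fn clear;
%mend;

/* Example call */
%addDatasetToZip(sashelp.heart, data/heart.sas7bdat);
```

Each call adds one more member to the same archive, so a series of calls builds up the ZIP file one data set at a time.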

Optional: Report on the ZIP file contents

If I want to report on the total contents of the ZIP file now, here's a DATA step and PROC PRINT step that do the job:

/* OPTIONAL for reporting */
/* Report on the contents of the ZIP file */
/* Assign a fileref with the ZIP method */
filename inzip zip "&projectDir./project.zip";
/* Read the "members" (files) from the ZIP file */
data contents(keep=memname);
    length memname $200;
    fid=dopen("inzip");
    if fid=0 then
        stop;
    memcount=dnum(fid);
    do i=1 to memcount;
        memname=dread(fid,i);
        output;
    end;
    rc=dclose(fid);
run;
/* create a report of the ZIP contents */
title "Files in the ZIP file";
proc print data=contents noobs N;
run;

Result:

Files in the ZIP file 

memname
---------------------
data/class.sas7bdat
data/cars.sas7bdat 
N = 2

I hope that this helps to make the FILENAME ZIP method more useful to those who want to try it out. I'm sure that there will be more scenarios that people will ask about; someday, if I write enough blog posts, I'll have it all covered!

Sample program: You can view/download the entire SAS program (containing the snippets I've featured and more) from my GitHub profile.

tags: FILENAME ZIP, SAS 9.4, SAS programming, ZIP files

The post Add files to a ZIP archive with FILENAME ZIP appeared first on The SAS Dummy.

And it's Boaty McBoatface by an order of magnitude


In a voting contest, is it possible for a huge population to get behind a ridiculous candidate with such force that no other contestant can possibly catch up? The answer is: Yes.

Just ask the folks at NERC, the environmental research organization in the UK. They are commissioning a new vessel for polar research, and they decided to crowdsource the naming process. Anyone in the world is welcome to visit their NameOurShip web site and suggest a name or vote on an existing name submission.

As of today, the leading name is "RRS Boaty McBoatface." ("RRS" is the standard prefix for a Royal Research Ship.) This wonderfully creative name is winning the race by more than just a little bit: it has 10 times as many votes as the next highest vote getter, "RRS Henry Worsley".

I wondered whether the raw data for this poll might be available, and I was pleased to find it embedded in the web page that shows the current entries. The raw data is in JSON format, embedded in the source of the HTML page. I saved the web page source to my local machine, copied out just the JSON line with the submissions data, then used SAS to parse the results. Here's my code:

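filename records "c:\projects\votedata.txt";

data votes (keep=title likes);
  /* assumption: titles fit in 200 characters; otherwise TITLE */
  /* would default to the full 32,767-byte _INFILE_ length     */
  length likes 8 title $ 200;
  format likes comma20.;
  label likes="Votes";
  length start len 8;
  /* compile each pattern once and keep the IDs across iterations */
  retain likes_regex title_regex;
  infile records;
  if _n_ = 1 then
    do;
      likes_regex = prxparse("/\'likes\'\:\s?([0-9]*)/");
      title_regex = prxparse("/\'title\':\s?\""([a-zA-Z0-9\'\s]+)/");
    end;
  /* read the next line into the _INFILE_ buffer */
  input;

  /* pull the vote count from capture group 1 */
  position = prxmatch(likes_regex,_infile_);
  if (position ^= 0) then
    do;
      call prxposn(likes_regex, 1, start, len);
      /* convert the matched digits to a number explicitly */
      if len > 0 then likes = input(substr(_infile_,start,len), 32.);
    end;
  start=0; len=0;

  /* pull the submission title from capture group 1 */
  position = prxmatch(title_regex,_infile_);
  if (position ^= 0) then
    do;
      call prxposn(title_regex, 1, start, len);
      title = substr(_infile_,start,len);
    end;
run;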

With the data in SAS, I used PROC FREQ to show the current tally:

title "Vote tally for NERC's Name Our Ship campaign";
proc freq data=votes order=freq;
table title;
weight likes;
run;

[Image: boatfacefreq]

The numbers are compelling: good ol' Boaty Mac has over 42% of the nearly 200,000 votes. The arguably more-respectable "Henry Worsley" entry is tracking at just 4%. I'm not an expert on polling and sample sizes, but even I can tell that Boaty McBoatface is going to be tough to beat.

To drive the point home a bit more, let's look at a box plot of the votes distribution.

title "Distribution of votes for ALL submissions";
proc sgplot data=votes;
hbox likes;
xaxis valueattrs=(size=12pt);
run;

In this output, we have a clear outlier:
[Image: hboxall]

If we exclude Boaty, the plot shows a slightly closer race among the other runners-up (which include some good serious entries, plus some whimsical ones, such as "Boatimus Prime"):

title "Distribution of votes for ALL submissions except Boaty McBoatface";
proc sgplot data=votes(where=(title^="Boaty McBoatface"));
hbox likes;
xaxis valueattrs=(size=12pt);
run;

[Image: hboxtrim]

See the difference in the automatic axis values between the two graphs? The tick marks show 80,000 vs. 8,000 as the top values.

Digging further, I wondered whether there were some recurring themes in the entries. I decided to calculate word frequencies using a technique I found on our SAS Support Communities (thanks to Cynthia Zender for sharing):

/* Tally the words across all submissions */
data wdcount(keep=word);
    set votes;
    i = 1;
    origword = scan(title,i);
    /* DO WHILE skips titles with no words, so no blank word is output */
    do while (origword ne ' ');
        /* normalize: lowercase and strip question marks */
        word = compress(lowcase(origword),'?');
        /* exclude the most common words */
        if word not in ('a','the','of','and') then output;
        i + 1;
        origword = scan(title,i);
    end;
run;
 
proc sql;
   create table work.wordcounts as 
   select t1.word, 
          /* count_of_word */
            (count(t1.word)) as word_count
      from work.wdcount t1
      group by t1.word
      order by word_count desc;
quit;
title "Frequently occurring words in boat name submissions";
proc print data=wordcounts(obs=25);
run;

The top words evoke the northern, cold nature of the boat's mission. Here are the top 25 words and their counts:

  1    polar         352 
  2    ice           193 
  3    explorer      110 
  4    arctic         86 
  5    red            69 
  6    sir            55 
  7    john           54 
  8    lady           46 
  9    sea            42 
 10    ocean          42 
 11    scott          41 
 12    bear           39 
 13    aurora         38 
 14    artic          37 
 15    queen          37 
 16    captain        36 
 17    james          36 
 18    endeavour      35 
 19    william        35 
 20    star           34 
 21    spirit         34 
 22    new            26 
 23    antarctic      26 
 24    boat           25 
 25    cold           25
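
For comparison, here is a minimal sketch of the same kind of tally using PROC FREQ instead of PROC SQL. The wdcount_demo and wordcounts_demo data set names and the sample rows are hypothetical stand-ins, not the real submission data:

```sas
/* Hypothetical sample rows standing in for the word-per-row data */
data wdcount_demo;
  length word $ 20;
  input word $;
  datalines;
polar
ice
polar
explorer
ice
polar
;
run;

/* Count each distinct word; OUT= writes COUNT and PERCENT columns */
proc freq data=wdcount_demo noprint;
  tables word / out=wordcounts_demo(drop=percent rename=(count=word_count));
run;

/* Sort so the most frequent words come first */
proc sort data=wordcounts_demo;
  by descending word_count;
run;

proc print data=wordcounts_demo;
run;
```

Either approach should yield one row per word with its count; PROC SQL keeps the counting and sorting in a single step, while PROC FREQ avoids writing the GROUP BY yourself.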

I don't know when voting closes, so maybe whimsy will yet be outvoted by a more serious entry. Or maybe NERC will exercise their right to "take this under advisement" and set a certain standard for the finalist names. Whatever the outcome, I'm sure we haven't heard the last of Boaty...

tags: Boaty McBoatface, regular expressions, SAS programming, SGPLOT

The post And it's Boaty McBoatface by an order of magnitude appeared first on The SAS Dummy.

Boaty McBoatface is on the run

I know what you're thinking: two "Boaty McBoatface" articles within two weeks? And we're past April Fool's Day?

But since I posted my original analysis about the "Name our ship" phenomenon that's happening in the UK right now, a new contender has appeared: Poppy-Mai.

The cause of Poppy-Mai, a critically ill infant who has captured the imagination of many British citizens (and indeed, of the world), has made a very large dent in the lead that Boaty McBoatface holds.

[Image: poppymai]

Yes, "Boaty" still has a better-than-4:1 lead. But that's a lot closer than the 10:1 lead (over "Henry Worsley") from just over a week ago. Check out the box plot now: you can actually make out a few more dots. Voting is open for another 10 days -- and as we have seen, a lot can happen in that time.

[Image: poppybox]

As I take this second look at the submissions (now almost 6,300) and voting data (almost 350,000 votes cast), I've found a few more entries that made me chuckle. Some of them struck me by their word play, and others cater to my nerdy sensibilities. Here they are (capitalization retained):

While I'm on this topic, I want to give a shout-out to regex101, the online regular expression tester. I was able to develop and test my regular expressions there before dropping them into a PRXPARSE function call. I found that I had to adjust my regular expression to cast a wider net for valid titles in the name submissions data. Previously, I wasn't capturing all of the punctuation. I didn't expect punctuation to be part of a ship's name, but that assumption doesn't stop people from suggesting and voting on such names. My new regex match:

  title_regex = prxparse("/\'title\':\s?\""([a-zA-Z0-9\'\.\-\_\#\s\$\%\&\(\)\@\!]+)/");

I could probably optimize by specifying an exception pattern instead of an inclusion pattern...but this isn't the sort of project where I worry about that.
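
As a rough sketch of that exception-pattern idea: instead of listing every allowed character, the capture group can accept anything that is not a double quote. The sample record below is made up for illustration and only mimics the 'title': "..." layout; the real data may differ, and a title that contains an escaped quote would be cut short.

```sas
/* Sketch: capture the title with a negated character class.   */
/* The record in LINE is a hypothetical example, not real data. */
data _null_;
  length title $ 200;
  /* "" inside a double-quoted SAS string is one literal quote, */
  /* so the class below is the regex [^\"]: any non-quote char  */
  title_regex = prxparse("/\'title\':\s?\""([^\""]+)/");
  line = "{'title': ""Boaty Mc-Boatface!"", 'likes': 12}";
  if prxmatch(title_regex, line) then
    /* PRXPOSN function form returns the text of capture group 1 */
    title = prxposn(title_regex, 1, line);
  put title=;
run;
```

With this form there is no need to keep adding punctuation characters to the class as new submissions turn up.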

Will I write about Boaty McBoatface again? What will my next Boaty article reveal? Stay tuned!

tags: Boaty McBoatface, regular expressions, SAS programming, SGPLOT

The post Boaty McBoatface is on the run appeared first on The SAS Dummy.
