PURPOSE:
  To act as a base for discussion and development of a
  Matlab - Geodise/Grid interface.

VERSION CONTROL:
  CURRENT - the latest version can be obtained by email to
            m.molinari@soton.ac.uk
  UPDATE  - please send updates in text format by email to MM. They
            will be merged with the current version using the CVS
            system until we move to a new infrastructure.

  0.1    Created by Marc 05/07/2002
         -> based on discussion at Geodise meeting 05/07/02
  0.1.1  08/07/02
         Filled in some gaps.
         Added Appendix with sample matlab file.
         Created sections 'Failure and Error Handling', and
         'Monitoring jobs'.
         Renamed grid_move into grid_transfer.


MATLAB TO GEODISE INTERFACE
===========================

-- INTRODUCTION --

The Geodise Toolbox will enable design optimisation by offering easy
access to knowledge bases which guide the design process, to databases
consisting of designs and object models, and to distributed computing
resources.

An important question is how to present the Geodise interface to the
end user, the engineer. As many scientists/engineers use the
platform-independent Matlab software package
(http://www.mathworks.com) for prototyping or development, it is
desirable to provide an interface to Geodise which can easily be used
by engineers familiar with Matlab.

-- CONTENTS --

This document should be used and edited as a collection of thoughts,
requirements, identified issues, etc. related to such an
implementation of a Matlab interface to Geodise.

Please feel free to add and discuss any relevant issues directly
within the text - this is a 'living' document and has no limits on the
content, so please make your contributions however big/small they
might be. We all can only benefit from sharing this information and
can access it whenever necessary.

-- EXISTING PROJECTS --

Some existing projects which might be of relevance are:

NetSolve - http://icl.cs.utk.edu/netsolve/
  RPC based client/agent/server system based on standard Internet
  protocols (TCP/IP sockets).
  Implements Network Weather Service (NWS), Numerical Libraries
  (LAPACK, BLAS, PETSc, AZTEC), client-server authentication via
  Kerberos and a GUI Problem Description File generator. Supports
  Matlab, Fortran and C, and allows use of the Condor interface and
  Globus (in testing phase).

NEOS / metaNEOS / iNEOS - http://www-neos.mcs.anl.gov/neos/
                          http://www-unix.mcs.anl.gov/metaneos/
  NEOS is a Metacomputing Environment for Optimization. It is a
  project with no direct relation to Matlab.
  Some characteristics: Web interface for job submission and control,
  communication based on CORBA, status information provided in XML,
  usage of Condor pool. Focus on the dynamic computing environment
  (heterogeneity, failures, reliability, large scale, etc.).
  From their website:
  "The metaNEOS project integrates fundamental algorithmic research in
  optimization with research and infrastructure tool development in
  distributed systems management. Algorithms that can exploit the
  powerful but heterogeneous, high-latency and possibly failure-prone
  virtual hardware platform typical of metacomputing platforms have
  been developed in such areas as global optimization, integer linear
  optimization, integer nonlinear optimization, combinatorial
  optimization, and stochastic optimization."

Ninf - http://ninf.apgrid.org
  Project very similar to NetSolve. Based on Ninf RPCs.
  Client/metaserver/server structure. The metaserver is an agent
  implemented in Java which delegates calls from the client to the
  servers.

-- IMPLEMENTATION --

The three main issues to address in terms of implementation are:

  1) Development Environment
  2) Knowledge Base Access
  3) Submission and Execution of Jobs

As for the development environment, we know that Matlab has two modes
of operation: either a written script can be run by the Matlab
interpreter, or functions can be executed directly on a command line
version of Matlab. Providing an additional, new problem solving
environment is certainly beyond the currently available resources.
It might be useful to look into the use and extension of existing and
widely available PSEs (for example the graphical package SCIRun) [...]

Many users of Matlab write these scripts either with the Matlab editor
or with other text editors, such as emacs, wordpad, etc. To provide a
knowledge base driven interactive design process, a separate designer
tool with a link to the engineering design knowledge base is
preferred. This avoids having to link the knowledge base API to
proprietary editors. Another suggestion was to base this knowledge
base interface on webpages. [...]

The execution of jobs requires an interface and infrastructure which
transfers commands and data to the execution site(s) and ensures that
the program is dealt with correctly (interpretation, compilation,
etc.). [...]

-- COMMAND CLASSIFICATION --

There exist 3 different types of Matlab commands. These can be
classified and described as:

  (I) Interactive commands
      Commands which immediately return information.
  (S) Scripting commands
      Commands which can be part of a matlab script, for example
      function calls.
  (O) Offline commands
      Commands such as grid_reserve.
      -> More clarification is needed as to the meaning of 'offline'.

-- IMPORTANT MATLAB COMMANDS --

Based on the discussions at the 05/07/02 meeting, the knowledge base
should be separated from the Matlab environment so that it can be used
in combination with any editor and is operating system independent.

The main issues to consider on the user side for interfacing to and
accessing knowledge support are
  + Finding a method
  + Usage of this method
  + Show/explore a sample script

Local(?) services that need to be provided in this context are
  + Ontology service
  + Knowledge base service
  + Annotation service

The commands identified as most important for a Matlab interface to
Grid/Geodise are:

+ grid_init
    SYNTAX: [RESULT] = grid_init(???)
    Initialises the interface to Geodise/Grid.
    Checks whether the Grid connection is established and working.
    Deals with security issues (retrieval of certificate, etc.).
    Provide user contact details (email).

+ grid_reserve
    SYNTAX: [RESULT] = grid_reserve(NNODES, NODETYPE, DATE,
                                    LENGTH OF TIME)
    Reserves a specified number of resources for a specific time
    window on the Grid.

+ grid_status
    SYNTAX: [RESULT] = grid_status([SPECIFICSERVER])
    Shows the status of current Grid connections.
    Lists available servers, databases, etc.
    Maybe indicates the length of the job queue.

+ grid_status(job)
    SYNTAX: [RESULT] = grid_status(JOBID)
    Returns information about specific submitted job(s), e.g.
    'queued', 'running', 'finished', etc.

+ grid_run
    SYNTAX: [RESULT, STATUS, JOBID] = grid_run('command'
                                               [, PARAMETERLIST])
    Submits a command to the Grid and executes it.
    + blocking
        Waits for the results to come back before execution of the
        script is resumed.
    + non-blocking
        Returns immediately after submission. Carries on executing
        the script. Results need to be polled when available.
        -> Is there a function for polling??? (mm)

+ grid_submit
    SYNTAX: [RESULT, STATUS, JOBID] = grid_submit('script'
                                                  [, PARAMETERLIST])
    Submits a whole matlab script for execution to Grid/Geodise.
    -> Not sure if we really need this as it might make things very
       complicated (need parser, compiler/licenses, etc.)

+ grid_load
    SYNTAX: [RESULT, DATA] = grid_load('from-location')
    Loads data from a given file or location or database.
    + pass by value
        The data is returned as such.
    + pass by reference
        Only a pointer/reference/url is returned.
    -> maybe a grid_load(from, to) would be useful to transfer data
       to a different location, see also grid_transfer (formerly
       grid_move).

+ grid_save
    SYNTAX: [RESULT] = grid_save('to-location', DATA/'from-location')
    Saves data in a given file or location or database.
    -> Q: What datastructure is used for this? (mm)
    + pass by value
        The data is passed directly to the output stream/file.
    + pass by reference
        The data is referenced by pointer/url/link.

+ grid_query
    SYNTAX: [DATA] = grid_query(DATABASE, QUERYSTRING)
    Queries a database about data format, contents, etc.

+ **new** grid_transfer
    SYNTAX: [RESULT] = grid_transfer('from-location', 'to-location')
    Moves data from some location to the server where the job is
    executed and keeps it there for repeated computations.
    -> Don't know how relevant this is. Needs discussing. (mm)

+ grid_locate
    SYNTAX: [DATA] = grid_locate(DATABASE(S), QUERYSTRING)
    Locates files, data, metadata in databases and returns a list of
    files satisfying a user query.

+ grid_pause
    SYNTAX: [STATUS] = grid_pause([JOBID])
    Pauses a submitted job.
    -> Not quite clear about the purpose... (mm)
    -> Maybe a job which needs to wait for another one to finish
       first? (mm)
    -> Does this cause problems with scheduling/queueing/time
       reservation? (mm)

+ grid_resume
    SYNTAX: [STATUS] = grid_resume([JOBID])
    Resumes a paused job or all paused jobs.
    See also: grid_pause

+ grid_stop / kill / remove
    SYNTAX: [STATUS] = grid_stop(JOBID)
    Stops or kills or removes a currently running job. This means
    stopping it and/or removing it from the job queue.

-- FAILURE & ERROR HANDLING --

All of the commands described in 'IMPORTANT MATLAB COMMANDS' can
potentially fail for a number of reasons. These include
  + unavailability of network connection
  + no Grid resources can be located
  + no Grid resources available at the moment
  + database is non-existent
  + database is down
  + pointers to data are in wrong format (local/global/url)
  + no access rights for reading/writing/execution
  + no security certificate on local computer
  + local (Matlab) process finishes without polling results
  + remote process can fail (remote site errors)
  + remote process can fail (scripting errors, e.g. 'division by 0')
  + command parameters are formulated incorrectly
  + JOBID has disappeared from the queue
  + ...
-> Q: How are these failures handled at present in the Geodise
      context? (mm)

These have to be taken into account when implementing the commands.
The error cause and a possible solution (knowledge-based!) have to be
communicated to the user. This can be done either by displaying the
information directly in the Matlab environment for interactive (I)
commands, or by the user requesting status information as output from
a function call, e.g.

    [RESULT, STATUS, JOBID] = grid_run('command' [, PARAMETERLIST])

where RESULT contains the results of the command, STATUS returns an
error indicator if the command cannot be executed, and JOBID is the
job identifier for a running job.

An alternative would be to pop up a helper/status window containing a
job list and/or error messages related to Grid/Geodise.
-> This certainly needs more discussion. (mm)

-- MONITORING JOBS --

(fenglian) Job states should be monitored continuously so that the
user knows how much time is left for the jobs to finish.
-> Q: Where does this information come from?
-> Q: How can it be realised/implemented?
-> Q: What services are provided where?
-> Are these implementations linked more to Geodise or do these need
   to be implemented in Matlab window/routines?

-- APPENDIX A: SAMPLE MATLAB SCRIPT --

This is a matlab script which computes, in a rather inefficient way, a
best-fit curve of the form f(x) = ax^2+bx+c to a number of measurement
points (number given by parameter n).

=========== CODE OF FILE optsample.m STARTS HERE =============
function [Result] = optsample(n)
% OPTSAMPLE is a sample script which computes
% the parameters necessary for fitting a quadratic
% function f = ax^2+bx+c to a number of measurements
% in a least square sense.
%
% INPUT:
%    n, number of measurements, optional, n>=3
%
% OUTPUT:
%    Result, optional argument returning the 3 parameters [a,b,c]
%
% written by Marc Molinari

% This line will not be displayed when typing 'help optsample'
% in Matlab since there is a space in the previous line.

% check if n is given and valid
if nargin>0          % parameter given
  if n<3
    error('Parameter n is too small, must be >= 3');
  end
else
  n=10;
end

% load in measurement data:
% M = load('/home/mydata/optsample.mat')
% or simulate measurement:
x = [0:10/n:10-1e-6]';
y = (x-5).^2 + 1*x + 1;
y = y + 5*rand(n,1);      % add measurement noise
M = [x y];

% display measurement data:
f = gcf;
plot(M(:,1),M(:,2),'or');

% we start by fitting the curve to the first three points
p = [1:3]';

for t=3:n
  % select first t measurements for curve fitting
  p = [1:t]';

  % call function that fits the curve and returns the parameters
  R = fitsquare(M(p,:));

  % plot graph on same figure:
  plot(M(:,1),M(:,2),'or');
  hold on;
  f = R(1)*x(1:t).^2 + R(2)*x(1:t) + R(3);
  plot(x(1:t), f);
  hold off
end

% check if return parameter present
if nargout>0
  Result = R;
end

return

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [R] = fitsquare(M)
% This function is a local function, i.e. not visible from
% outside the function optsample.
% Fitsquare() takes measurements M and returns fitting
% parameters in vector R for a quadratic fitting function.

% do some 'computationally intensive' calculations
x = M(:,1);
y = M(:,2);

a1 = size(x,1);
a2 = sum(x);
a3 = sum(x.^2);
a4 = sum(x.^3);
a5 = sum(x.^4);

b1 = sum(y);
b2 = sum(y.*x);
b3 = sum(y.*(x.^2));

R = [a5, a4, a3; a4, a3, a2; a3, a2, a1] \ [b3; b2; b1];

% simulate more computational time...
pause(1);

return
=========== CODE OF FILE optsample.m FINISHES HERE =============
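
-- APPENDIX B: SKETCH OF POLLING A NON-BLOCKING JOB --

The non-blocking mode of grid_run in 'IMPORTANT MATLAB COMMANDS' says
that results need to be polled, and asks whether a polling function
exists. The following is only a minimal sketch of how such a loop
might look from the user's side. None of the grid_* commands exist
yet, so everything Grid-related is simulated: demo_grid_run,
demo_grid_status and demo_grid_result are hypothetical local stand-ins
invented for this example, the job handle is a plain struct, and the
status strings are the ones suggested for grid_status(JOBID). The
simulated job simply moves from 'queued' to 'running' to 'finished'
as time passes.

```matlab
function [result, status, npolls] = poll_demo()
% POLL_DEMO  Sketch: non-blocking submission followed by polling.
% The demo_* functions below are local stand-ins (assumptions) that
% only simulate a Grid job; they are not an existing Geodise API.

job = demo_grid_run('fitsquare');     % returns immediately
status = demo_grid_status(job);
npolls = 1;
maxpolls = 100;                       % give up eventually

% keep asking until the job has ended or we run out of patience
while ~any(strcmp(status, {'finished', 'failed'})) && npolls < maxpolls
  pause(0.1);                         % wait before asking again
  status = demo_grid_status(job);
  npolls = npolls + 1;
end

if strcmp(status, 'finished')
  result = demo_grid_result(job);     % hypothetical result retrieval
else
  result = [];                        % failed or gave up polling
  fprintf('Job %d ended with status ''%s''\n', job.id, status);
end

return

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function job = demo_grid_run(command)
% Pretend to submit COMMAND; remember the submission time in the
% job handle so that the status can be simulated later.
job.id = 1;
job.command = command;
job.t0 = tic;
return

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function status = demo_grid_status(job)
% Simulated life cycle based on elapsed time since submission:
% queued -> running -> finished.
elapsed = toc(job.t0);
if elapsed < 0.2
  status = 'queued';
elseif elapsed < 0.5
  status = 'running';
else
  status = 'finished';
end
return

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function result = demo_grid_result(job)
% Return a fixed dummy parameter vector [a; b; c]. These are the
% noise-free coefficients of the curve simulated in optsample.m:
% (x-5)^2 + x + 1 = x^2 - 9x + 26.
result = [1; -9; 26];
return
```

Calling [R, S] = poll_demo() should return after roughly half a
second with S equal to 'finished'. Whether polling should be left to
the user like this, or hidden behind a blocking call or a status
window, is one of the open questions raised above.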