2020-10-11

10: Dynamically Importing Any Python Module from Any Source

<The previous article in this series | The table of contents of this series | The next article in this series>

The source need not be an operating system file; it may be an HTTP or FTP resource, a database item, a program, or whatever, possibly with dynamically generated module contents.

Topics


About: the Python programming language

Starting Context


  • The reader has a basic knowledge of the Python programming language.

Target Context


  • The reader will know how to dynamically import any module, possibly with dynamically generated contents, from any source, which need not be an operating system file but may be an HTTP or FTP resource, a database item, a program, or whatever.

Orientation


The code introduced here includes mypy annotations; if they are unfamiliar, a previous article will be a sufficient explanation, or they can be safely ignored.


Main Body


1: What If a Python Module File Is Not at the Expected Path?


Hypothesizer 7
Usually, a Python module file is in an expected operating system directory, under an expected file name.

For example, a module, 'theBiasPlanet.coreUtilities.pipes.StringPipe', is in a file, 'theBiasPlanet/coreUtilities/pipes/StringPipe.py', where the 'theBiasPlanet' directory is in a directory registered in 'PYTHONPATH'.

What if the module is in a file named 'StringPipe.txt' or whatever and/or in a different directory structure?


2: What If a Python Module Is Not Even in Any Operating System File?


Hypothesizer 7
What if a Python module is not even in any operating system file?

What does that mean?

Well, it may be in an HTTP or FTP resource, a database item, a program, or whatever.

The module may even be dynamic, generated by a certain algorithm.


3: An Obvious Option


Hypothesizer 7
An obvious option is to save the module contents in a file at the expected path.

In other words, the module contents are cached in the file.

Why not?

Well, some people may have reasons to object; others may not.

I personally do not reject the option, but I wonder how exactly I should implement it: when and how the caching should be invoked; how the to-be-cached modules should be selected (I do not want to cache unnecessary modules); when and how the expired caches should be removed; and so on. It is not impossible, but not so painless, it seems.


4: A Possibly Better Option


Hypothesizer 7
A possibly better way is to use the 'importlib' package.

Its advantages are that only the necessary modules are fetched, as they are required, and that nothing is left behind afterward, which eliminates my above-mentioned concerns.


4-1: The Mechanism


Hypothesizer 7
A source loader is used, which receives a module name and returns the module contents as a bytes array.

This is how it works, where 'HttpPythonSourceLoader' is the source loader class (created by me) and 'theBiasPlanet_coreUtilities_pipes_StringPipe' is the module object (the module name is 'theBiasPlanet.coreUtilities.pipes.StringPipe' and the module exists at 'http://localhost:8080/pythonSource/theBiasPlanet/coreUtilities/pipes/StringPipe.py').

@Python Source Code
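		# Assumes 'import sys' and 'from types import ModuleType' at the top of the file.
		l_httpPythonSourceLoader: HttpPythonSourceLoader = HttpPythonSourceLoader ("http://localhost:8080/pythonSource/")
		l_pythonModuleName = "theBiasPlanet.coreUtilities.pipes.StringPipe"
		# Create an empty module object and let the loader fetch, compile, and execute the source in it.
		theBiasPlanet_coreUtilities_pipes_StringPipe = ModuleType (l_pythonModuleName)
		l_httpPythonSourceLoader.exec_module (theBiasPlanet_coreUtilities_pipes_StringPipe)
		# Register the module object in 'sys.modules' under the module name.
		sys.modules [l_pythonModuleName] = theBiasPlanet_coreUtilities_pipes_StringPipe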

Do you want to access the module as 'theBiasPlanet.coreUtilities.pipes.StringPipe', not as 'theBiasPlanet_coreUtilities_pipes_StringPipe'?

Well, the above code itself does not automatically create the 'theBiasPlanet' package object, etc., so, you will have to specifically create the package objects, if you really need that (I do not).
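If the package objects are really needed, the following is a minimal sketch of one way to create them, not the only way. 'registerModuleWithParentPackages' is a hypothetical helper name, and the 'demoPackage.subPackage.demoModule' module with its 'greet' function is a made-up stand-in for a module that a source loader has already executed.

```python
import sys
from types import ModuleType

def registerModuleWithParentPackages (a_moduleName: str, a_module: ModuleType) -> None:
	# Register the module itself under its full name.
	sys.modules [a_moduleName] = a_module
	l_childName: str = a_moduleName
	l_child: ModuleType = a_module
	# Walk up the dotted name, creating any missing parent package object.
	while "." in l_childName:
		l_parentName, l_childShortName = l_childName.rsplit (".", 1)
		l_parent = sys.modules.get (l_parentName)
		if l_parent is None:
			l_parent = ModuleType (l_parentName)
			# An object with a '__path__' attribute is treated as a package.
			l_parent.__path__ = []
			sys.modules [l_parentName] = l_parent
		# Make the child reachable as an attribute of its parent.
		setattr (l_parent, l_childShortName, l_child)
		l_childName = l_parentName
		l_child = l_parent

# A stand-in for a module already executed by a source loader.
l_demoModule = ModuleType ("demoPackage.subPackage.demoModule")
exec ("def greet ():\n\treturn 'hello'\n", l_demoModule.__dict__)
registerModuleWithParentPackages ("demoPackage.subPackage.demoModule", l_demoModule)

import demoPackage.subPackage.demoModule
from demoPackage.subPackage.demoModule import greet
```

After that, the module is accessible through the dotted name, because each package object holds its child as an attribute.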

If you want the equivalent of 'from theBiasPlanet.coreUtilities.pipes.StringPipe import StringPipe' (the 'StringPipe' class is inside the module), this will do, after the code that loads the module.

@Python Source Code
		StringPipe = theBiasPlanet_coreUtilities_pipes_StringPipe.StringPipe


4-2: Creating the Source Loader Class


Hypothesizer 7
So, the issue is how to create the source loader class.

In fact, this is the source loader class.

@Python Source Code
from typing import Union
from typing import cast
from http.client import HTTPConnection
from http.client import HTTPResponse
from http.client import HTTPSConnection
from importlib.abc import SourceLoader
import urllib.parse
from urllib.parse import ParseResult

class HttpPythonSourceLoader (SourceLoader):
	c_readingBlockSize: int = 1024
	
	def __init__ (a_this: "HttpPythonSourceLoader", a_urlPrefix: str) -> None:
		a_this.i_urlPrefix: str = a_urlPrefix
		# Only declared here; the connection object is created below according to the URL scheme.
		a_this.i_httpConnection: HTTPConnection
		
		l_parsedUrl: ParseResult = urllib.parse.urlparse (a_this.i_urlPrefix)
		if l_parsedUrl.scheme == "https":
			a_this.i_httpConnection = HTTPSConnection (l_parsedUrl.hostname, l_parsedUrl.port)
		else:
			a_this.i_httpConnection = HTTPConnection (l_parsedUrl.hostname, l_parsedUrl.port)
	
	def get_filename (a_this: "HttpPythonSourceLoader", a_pythonModuleName: str) -> str:
		return "{0:s}{1:s}.{2:s}".format (a_this.i_urlPrefix, a_pythonModuleName.replace (".", "/"), "py")
	
	def get_data (a_this: "HttpPythonSourceLoader", a_pythonModuleUrl: Union [bytes, str]) -> bytes:
		l_httpResponseBody: bytes = b""
		try:
			l_parsedUrl: ParseResult = urllib.parse.urlparse (cast (str, a_pythonModuleUrl))
			a_this.i_httpConnection.request ("GET", l_parsedUrl.path)
			l_httpResponse: HTTPResponse = a_this.i_httpConnection.getresponse ()
			if l_httpResponse.status == 200:
				l_httpResponseBodyBuffer: bytes = b""
				while True:
					l_httpResponseBodyBuffer = l_httpResponse.read (HttpPythonSourceLoader.c_readingBlockSize)
					if len (l_httpResponseBodyBuffer) == 0:
						break
					l_httpResponseBody = l_httpResponseBody + l_httpResponseBodyBuffer
			else:
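				# 'get_data' is expected to raise 'OSError' when the data cannot be found.
				raise OSError ("The Python module source could not be accessed: the HTTP status is {0:d}.".format (l_httpResponse.status))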
		finally:
			a_this.i_httpConnection.close ()
			
		return l_httpResponseBody

That is an implementation of the abstract 'importlib.abc.SourceLoader' class, implementing the 2 methods, 'get_filename' and 'get_data'.

The 'get_filename' method receives the module name and returns the URL of the module, which (the URL) is passed into the 'get_data' method, which returns the contents of the URL.
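To see that flow without any HTTP server, here is a minimal sketch that keeps the sources in a dictionary and records each call. 'InMemoryPythonSourceLoader', the 'memory://' pseudo-URL scheme, and the 'demo.greeter' module are all made up for this illustration.

```python
from importlib.abc import SourceLoader
from types import ModuleType
from typing import Dict, List, Union

class InMemoryPythonSourceLoader (SourceLoader):
	def __init__ (a_this: "InMemoryPythonSourceLoader", a_urlToSourceMap: Dict [str, bytes]) -> None:
		a_this.i_urlToSourceMap: Dict [str, bytes] = a_urlToSourceMap
		a_this.i_callLog: List [str] = []

	def get_filename (a_this: "InMemoryPythonSourceLoader", a_pythonModuleName: str) -> str:
		# Map the module name to a pseudo URL, which is later passed into 'get_data'.
		a_this.i_callLog.append ("get_filename ({0:s})".format (a_pythonModuleName))
		return "memory://{0:s}.py".format (a_pythonModuleName.replace (".", "/"))

	def get_data (a_this: "InMemoryPythonSourceLoader", a_pythonModuleUrl: Union [bytes, str]) -> bytes:
		l_pythonModuleUrl: str = str (a_pythonModuleUrl)
		a_this.i_callLog.append ("get_data ({0:s})".format (l_pythonModuleUrl))
		try:
			return a_this.i_urlToSourceMap [l_pythonModuleUrl]
		except KeyError:
			# 'get_data' is expected to raise 'OSError' when the data cannot be found.
			raise OSError ("The Python module source could not be found: {0:s}".format (l_pythonModuleUrl))

l_loader = InMemoryPythonSourceLoader ({"memory://demo/greeter.py": b"def greet ():\n\treturn 'hello'\n"})
l_module = ModuleType ("demo.greeter")
l_loader.exec_module (l_module)
# l_loader.i_callLog now holds the 'get_filename' call followed by the 'get_data' call.
```

The inherited 'exec_module' method does the rest: it compiles the returned bytes and executes the code in the module object.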

It will be straightforward to also write a source loader that accesses a database or whatever.
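As an illustration of the database case, here is a minimal sketch that uses the standard 'sqlite3' module. The 'SqlitePythonSourceLoader' class, the 'pythonSource' table with its 'moduleName' and 'source' columns, the 'sqlite:pythonSource/' pseudo path, and the 'demo.adder' module are all assumptions made for this example; a real database will have its own schema.

```python
import sqlite3
from importlib.abc import SourceLoader
from types import ModuleType
from typing import Union

class SqlitePythonSourceLoader (SourceLoader):
	c_pathPrefix: str = "sqlite:pythonSource/"

	def __init__ (a_this: "SqlitePythonSourceLoader", a_databaseConnection: sqlite3.Connection) -> None:
		a_this.i_databaseConnection: sqlite3.Connection = a_databaseConnection

	def get_filename (a_this: "SqlitePythonSourceLoader", a_pythonModuleName: str) -> str:
		# The pseudo path just carries the module name, which is the table key.
		return SqlitePythonSourceLoader.c_pathPrefix + a_pythonModuleName

	def get_data (a_this: "SqlitePythonSourceLoader", a_pythonModulePath: Union [bytes, str]) -> bytes:
		l_pythonModulePath: str = a_pythonModulePath.decode ("utf-8") if isinstance (a_pythonModulePath, bytes) else a_pythonModulePath
		l_pythonModuleName: str = l_pythonModulePath.rsplit ("/", 1) [-1]
		l_row = a_this.i_databaseConnection.execute ("SELECT source FROM pythonSource WHERE moduleName = ?", (l_pythonModuleName,)).fetchone ()
		if l_row is None:
			raise OSError ("The Python module source could not be found: {0:s}".format (l_pythonModuleName))
		# The source is stored as text in this sketch, so it is encoded into bytes.
		return l_row [0].encode ("utf-8")

# Prepare an in-memory database that holds one module source.
l_databaseConnection = sqlite3.connect (":memory:")
l_databaseConnection.execute ("CREATE TABLE pythonSource (moduleName TEXT PRIMARY KEY, source TEXT)")
l_databaseConnection.execute ("INSERT INTO pythonSource VALUES (?, ?)", ("demo.adder", "def add (a_x, a_y):\n\treturn a_x + a_y\n"))

l_loader = SqlitePythonSourceLoader (l_databaseConnection)
l_module = ModuleType ("demo.adder")
l_loader.exec_module (l_module)
l_sum = l_module.add (1, 2)
```

Only the 2 methods differ from the HTTP version; the way the module object is created and executed is the same.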


4-3: Although There Is a Built-In File Source Loader Class . . .


Hypothesizer 7
If the source is an operating system file, the built-in 'importlib.machinery.SourceFileLoader' class could be used.

However, that class takes the module name and the file path in the constructor.

Huh? The module name and the file path are fixed for the class instance? Does that mean that we have to create an instance per module? . . . It seems to mean so.

That seems an odd design that goes against the intention of the parent class: the 'get_filename' method is rendered totally meaningless that way.
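The following small sketch shows that per-module binding with a file named 'StringPipe.txt' in a temporary directory. The file contents are a made-up stand-in, not the real 'StringPipe' class.

```python
import os
import tempfile
from importlib.machinery import SourceFileLoader
from types import ModuleType

with tempfile.TemporaryDirectory () as l_directoryPath:
	# Write a stand-in module source into a file whose extension is not 'py'.
	l_filePath: str = os.path.join (l_directoryPath, "StringPipe.txt")
	with open (l_filePath, "w", encoding = "utf-8") as l_file:
		l_file.write ("class StringPipe:\n\tpass\n")

	# The module name and the file path are fixed at construction.
	l_loader = SourceFileLoader ("StringPipe", l_filePath)
	l_module = ModuleType ("StringPipe")
	l_loader.exec_module (l_module)

	# The same instance refuses any other module name.
	l_otherNameIsRefused: bool = False
	try:
		l_loader.get_filename ("AnotherModule")
	except ImportError:
		l_otherNameIsRefused = True
```

So, the file name and the directory structure are free, but one loader instance serves only one module.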

If you want one that does not need to be instantiated per module, it will be easy to create it yourself.
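For example, this is a minimal sketch of such a loader, under the assumption that the module name maps onto a path under a base directory with a configurable file name extension. 'DirectoryPythonSourceLoader' and the 'demoPackage' modules are hypothetical names made up for this example.

```python
import os
import tempfile
from importlib.abc import SourceLoader
from types import ModuleType
from typing import Union

class DirectoryPythonSourceLoader (SourceLoader):
	def __init__ (a_this: "DirectoryPythonSourceLoader", a_baseDirectoryPath: str, a_fileNameExtension: str = "py") -> None:
		a_this.i_baseDirectoryPath: str = a_baseDirectoryPath
		a_this.i_fileNameExtension: str = a_fileNameExtension

	def get_filename (a_this: "DirectoryPythonSourceLoader", a_pythonModuleName: str) -> str:
		# The path is computed per module name, so one instance serves any module.
		return "{0:s}.{1:s}".format (os.path.join (a_this.i_baseDirectoryPath, *a_pythonModuleName.split (".")), a_this.i_fileNameExtension)

	def get_data (a_this: "DirectoryPythonSourceLoader", a_pythonModulePath: Union [bytes, str]) -> bytes:
		# 'open' raises 'OSError' (a subclass, in fact) if the file is missing.
		with open (a_pythonModulePath, "rb") as l_file:
			return l_file.read ()

with tempfile.TemporaryDirectory () as l_directoryPath:
	# Prepare 2 stand-in module files with the 'txt' extension.
	os.makedirs (os.path.join (l_directoryPath, "demoPackage"))
	for l_shortName, l_value in (("first", 1), ("second", 2)):
		with open (os.path.join (l_directoryPath, "demoPackage", l_shortName + ".txt"), "w", encoding = "utf-8") as l_file:
			l_file.write ("VALUE = {0:d}\n".format (l_value))

	# A single loader instance loads both of the modules.
	l_loader = DirectoryPythonSourceLoader (l_directoryPath, "txt")
	l_firstModule = ModuleType ("demoPackage.first")
	l_loader.exec_module (l_firstModule)
	l_secondModule = ModuleType ("demoPackage.second")
	l_loader.exec_module (l_secondModule)
```

Here the 'get_filename' method regains its meaning, as it is the one that decides the path per module.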

