OpCodes, MSIL and Reflecting .NET code.

    February 24, 2012 • .Net, Programming

    Extracting the raw MSIL code and OpCodes that make up a .NET Assembly isn’t impossible, but it’s not that straightforward.

    Having access to the MSIL is a great way to see what is going on under the hood in .NET. If you want to know more, I would strongly suggest getting CLR via C#; it is one of my favourite books on the subject.

    You could just go and download a copy of .NET Reflector, but isn’t it more fun poking around under the hood yourself?

    Firstly you need to open a .NET Assembly and find all the methods contained within. Usually you will want to load a separate assembly, but you can also use the assembly of the application itself. For all of this you will need to add a using directive for the System.Reflection namespace.

    Assembly assembly = Assembly.GetExecutingAssembly();
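
    To inspect a different assembly instead, a minimal sketch might look like the following. The AssemblyLoader class and the LoadTarget name are made up for illustration; Assembly.LoadFrom is the standard reflection call for loading an assembly from a file path.

```csharp
using System;
using System.Reflection;

public static class AssemblyLoader
{
    // Hypothetical helper: loads the assembly at 'path', or falls back to the
    // currently executing assembly when no path is given.
    public static Assembly LoadTarget(string path)
    {
        if (string.IsNullOrEmpty(path))
        {
            return Assembly.GetExecutingAssembly();
        }

        // LoadFrom takes the file path of the assembly to load.
        return Assembly.LoadFrom(path);
    }
}
```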
    

    For more information on loading separate assemblies, you can read my article on .NET plugin architectures.

    What we really want are the MethodBody objects in the Assembly. Methods are reached through the Types the assembly defines, and these need to be filtered to find the methods that actually have a body. If you already know the method you are looking for, you could use the following code, substituting the Type and Method name.

    MethodInfo mi = typeof(TestMethod).GetMethod("TesterMethod");
    MethodBody mb = mi.GetMethodBody();
    

    The long way round is to iterate through all the Types in the assembly and check which of their methods have a body. This lets a diagnostic program, for example, go through everything.

    
    foreach (Type type in assembly.GetTypes())
    {
        foreach (MethodInfo methodInfo in type.GetMethods())
        {
            MethodBody body = methodInfo.GetMethodBody();
            if (body != null)
            {
                // We now have access to Metadata and MSIL in the Method
            }
        }
    }
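
    As a rough illustration of what the MethodBody gives you, the sketch below summarises the size of the IL, the maximum stack depth and the number of locals for a method. The MethodBodyInspector class and its method names are hypothetical; the properties used (MaxStackSize, LocalVariables, GetILAsByteArray) are part of System.Reflection. It also passes BindingFlags so that non-public and static methods declared on the type are included, which the plain GetMethods() call does not do.

```csharp
using System;
using System.Reflection;

public static class MethodBodyInspector
{
    // Hypothetical helper: summarises the metadata exposed by a MethodBody.
    public static string Describe(MethodInfo method)
    {
        MethodBody body = method.GetMethodBody();
        if (body == null)
        {
            // Methods without IL (for example abstract ones) have no body.
            return method.Name + ": no method body";
        }

        byte[] il = body.GetILAsByteArray();
        return string.Format(
            "{0}: {1} IL bytes, max stack {2}, {3} local(s)",
            method.Name, il.Length, body.MaxStackSize, body.LocalVariables.Count);
    }

    // Prints a summary for every method declared on the type itself,
    // including non-public and static ones.
    public static void DumpType(Type type)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.NonPublic |
            BindingFlags.Instance | BindingFlags.Static | BindingFlags.DeclaredOnly;

        foreach (MethodInfo method in type.GetMethods(flags))
        {
            Console.WriteLine(Describe(method));
        }
    }
}
```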
    

    From here we can access the MSIL code which makes up the method. This is all the OpCode data needed to understand and start to disassemble the method.

    OpCodes are exposed as static fields of the System.Reflection.Emit.OpCodes class. This makes it harder to iterate over them to find the OpCode that corresponds to a given byte value in the data array. Using a lambda predicate with LINQ, we can solve this far more easily.

    // Requires: using System.Collections.Generic; using System.Linq;
    //           using System.Reflection; using System.Reflection.Emit;

    /// <summary>
    /// Returns the OpCodes represented by a MSIL byte array.
    /// </summary>
    /// <param name="data">MSIL byte array.</param>
    /// <returns>An array of the OpCodes representing the MSIL code.</returns>
    public static OpCode[] GetOpCodes(byte[] data)
    {
        List<OpCode> opCodes = new List<OpCode>();

        foreach (byte opCodeByte in data)
        {
            // Use LINQ to find the OpCodes field whose value matches this byte.
            // This is extremely inefficient; a lookup table should be preferred.
            // Note that every byte is treated as an opcode, so operand bytes and
            // two-byte (0xFE-prefixed) opcodes are not decoded correctly.
            FieldInfo field = typeof(OpCodes).GetFields().FirstOrDefault(
                t => ((OpCode)t.GetValue(null)).Value == opCodeByte);

            // Bytes with no matching OpCode are skipped.
            if (field != null)
            {
                opCodes.Add((OpCode)field.GetValue(null));
            }
        }

        return opCodes.ToArray();
    }
    

    To use this function we simply call it with the IL from the method.

    OpCode[] opCodes = GetOpCodes(body.GetILAsByteArray());
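
    Mapping each byte to an OpCode on its own will also pick up operand bytes (a branch offset or a metadata token, for example) as if they were instructions. The sketch below is one possible way around that, not the code from my example file: it builds a lookup table once and then uses each OpCode's OperandType to skip its operand. The IlWalker class and its member names are made up for illustration, and it assumes the IL is well formed.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

public static class IlWalker
{
    private static readonly Dictionary<short, OpCode> Lookup = BuildLookup();

    // Build the OpCode.Value -> OpCode table once, via reflection.
    private static Dictionary<short, OpCode> BuildLookup()
    {
        var table = new Dictionary<short, OpCode>();
        foreach (FieldInfo field in typeof(OpCodes).GetFields(BindingFlags.Public | BindingFlags.Static))
        {
            if (field.FieldType == typeof(OpCode))
            {
                OpCode op = (OpCode)field.GetValue(null);
                table[op.Value] = op;
            }
        }
        return table;
    }

    // Number of operand bytes following an opcode. The switch operand is
    // variable length and is handled in Decode.
    private static int OperandSize(OperandType type)
    {
        switch (type)
        {
            case OperandType.InlineNone: return 0;
            case OperandType.ShortInlineBrTarget:
            case OperandType.ShortInlineI:
            case OperandType.ShortInlineVar: return 1;
            case OperandType.InlineVar: return 2;
            case OperandType.InlineI8:
            case OperandType.InlineR: return 8;
            default: return 4; // tokens, 32-bit values, branch targets, ShortInlineR
        }
    }

    // Walks the IL stream instruction by instruction, skipping operands.
    public static List<OpCode> Decode(byte[] il)
    {
        var result = new List<OpCode>();
        int pos = 0;
        while (pos < il.Length)
        {
            // 0xFE prefixes a two-byte opcode; OpCode.Value stores it as 0xFExx.
            short value = il[pos++];
            if (value == 0xFE && pos < il.Length)
            {
                value = unchecked((short)(0xFE00 | il[pos++]));
            }

            OpCode op;
            if (!Lookup.TryGetValue(value, out op))
            {
                break; // unknown byte: stop rather than mis-decode the rest
            }
            result.Add(op);

            if (op.OperandType == OperandType.InlineSwitch)
            {
                // A 32-bit case count followed by that many 32-bit targets.
                int count = BitConverter.ToInt32(il, pos);
                pos += 4 + 4 * count;
            }
            else
            {
                pos += OperandSize(op.OperandType);
            }
        }
        return result;
    }
}
```

    Calling IlWalker.Decode(body.GetILAsByteArray()) and writing each op.Name to the Console then gives a basic instruction listing without the stray entries caused by operand bytes.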
    

    To see this in action you can add some Console output. If you are lazy, check out my example code file.

    Intercepting calls to Satellite .NET DLLs – Man in the Middle attacks.

    Having looked at MSIL code and at how to dig deeper into a .NET assembly, there is the possibility of using this functionality to intercept calls to a satellite DLL, and possibly to alter or store the parameters. What I mean can be summed up by the diagram below:

    What I am trying to build is an automated tool which generates a clone of a DLL with the same namespace, module and method signatures. The main application calls this DLL instead of the original, the parameters can be edited or stored, and then the original method in the originating DLL is called, effectively creating a ‘Middleman’ shim.

    So far I am scanning through the original assembly and creating the C# code required, which is output to the Console. While producing the code, we skip inherited methods such as GetType(), which come from System.Object rather than from the assembly being cloned.

    // Requires: using System; using System.Linq; using System.Reflection;
    public static void BuildIntercepter(Assembly assembly)
    {
        Console.WriteLine("namespace {0}\n{{", assembly.GetName().Name);

        // Get the names of all referenced assemblies apart from mscorlib.
        var referencedAssemblies = assembly.GetReferencedAssemblies()
             .Where(assem => !assem.Name.Equals("mscorlib"))
             .Select(assem => assem.Name);

        // Add all the using statements.
        foreach (var assem in referencedAssemblies)
        {
            // Only use the higher level part of the assembly name.
            string usingAssem = assem;
            if (usingAssem.IndexOf('.') != -1)
            {
                usingAssem = usingAssem.Substring(0, usingAssem.LastIndexOf('.'));
            }

            Console.WriteLine("\tusing {0};", usingAssem);
        }
        Console.WriteLine();

        Module[] modules = assembly.GetModules();

        // Only iterate through exported public Types.
        foreach (Type type in assembly.GetExportedTypes())
        {
            // Only keep concrete methods that originate from the assembly's own
            // modules and have a method body; this skips inherited methods.
            var methods = type.GetMethods()
                .Where(m => !m.IsVirtual &&
                            !m.IsAbstract &&
                            modules.Contains(m.Module) &&
                            m.GetMethodBody() != null)
                .ToList();

            if (methods.Count > 0)
            {
                Console.WriteLine("\tpublic class {0}\n\t{{", type.Name);
                foreach (MethodInfo methodInfo in methods)
                {
                    // Build the parameters that go to the original function,
                    // joined without a trailing comma.
                    string parameters = string.Join(", ",
                        methodInfo.GetParameters()
                            .Select(p => p.ParameterType + " " + p.Name));

                    // System.Void is not valid in C# source, so emit "void".
                    string returnType = methodInfo.ReturnType == typeof(void)
                        ? "void"
                        : methodInfo.ReturnType.FullName;

                    Console.WriteLine("\n\t\tpublic {0} {1}({2})\n\t\t{{\n\t\t}}\n",
                        returnType,
                        methodInfo.Name,
                        parameters);
                }
                Console.WriteLine("\t}\n");
            }
        }

        Console.WriteLine("}");
    }
    

    Scanning over the assembly in this way writes the skeleton C# code for the shim to the Console.

    I know, I know, I could use Reflection.Emit to generate the MSIL for the new assembly and so build the shim DLL programmatically, but writing out source code makes it easier for the programmer to modify.

    Next I will be dynamically invoking the original methods from the middleman shim, wiring everything up correctly so the calling Assembly won’t (hopefully) know any better.
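
    That step isn’t written yet, but a rough sketch of the forwarding call might look like this. The ShimInvoker class and the Forward method are hypothetical names, and the sketch assumes the original DLL has already been loaded as an Assembly (for instance from a renamed copy). It only uses standard reflection calls: Assembly.GetType, Type.GetMethod and MethodInfo.Invoke.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class ShimInvoker
{
    // Hypothetical helper a generated shim method could call: logs the
    // arguments, then forwards the call to the original assembly by reflection.
    public static object Forward(Assembly original, string typeName,
        string methodName, object instance, params object[] args)
    {
        Type type = original.GetType(typeName, true);

        // Resolve the overload from the runtime argument types. This sketch
        // assumes no argument is null and there are no ref/out parameters.
        Type[] argTypes = args.Select(a => a.GetType()).ToArray();
        MethodInfo method = type.GetMethod(methodName, argTypes);
        if (method == null)
        {
            throw new MissingMethodException(typeName, methodName);
        }

        // This is the point where parameters could be stored or altered.
        Console.WriteLine("{0}.{1}({2})", typeName, methodName, string.Join(", ", args));

        // Pass null as the instance for static methods.
        return method.Invoke(instance, args);
    }
}
```

    Each generated method body in the shim would then reduce to a single Forward call with its own type name, method name and parameters.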

    It should be noted that this style of attack only works if there are limited checks on the Satellite Assembly (or it hasn’t been strong-name signed) and, importantly, it hasn’t been obfuscated, which can easily be done with the free Dotfuscator tools that ship with Visual Studio.
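
    As a small illustration of the strong-name point, the sketch below checks whether an assembly’s name carries a public key token. The StrongNameCheck class is a made-up name, and this only tells you the assembly claims to be signed; it does not verify the signature itself.

```csharp
using System;
using System.Reflection;

public static class StrongNameCheck
{
    // Returns true when the assembly's name carries a public key token,
    // i.e. it was built with a strong name. The signature is not verified.
    public static bool HasStrongName(Assembly assembly)
    {
        byte[] token = assembly.GetName().GetPublicKeyToken();
        return token != null && token.Length > 0;
    }
}
```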

    About

    Software engineer. Tea drinker

    http://MrPfister.com
